ABSTRACT


Time  series  data  often  generate  disturbances  which  are 
time  dependent.  The  simplest  form  of  this  time  dependence  is 
first order autocorrelation. Numerous limited information
estimators have been proposed for the estimation of
autocorrelated models. A chronological list of these methods
would include Theil (1958), Sargan (1961), Amemiya (1966),
Fair (1970, 1972), Dhrymes, Berner and Cummins (1974), and
Hatanaka (1976). These methods can be distinguished by
whether they employ T or T-1 observations, the reduced form
they  use,  the  way  they  estimate  the  autocorrelation 
coefficient  and  whether  they  allow  for  the  presence  of 
lagged  endogenous  variables  in  the  equation. 

Most  of  the  proposed  estimators  are  asymptotically 
efficient,  but  nothing  is  known  about  their  small  sample 
properties.  An  econometric  practitioner  invariably  deals 
with  small  samples.  Therefore,  it  is  of  great  interest  to 
inquire  into  the  small  sample  behaviour  of  these  estimators. 
The  purpose  of  this  thesis  is  to  investigate,  using  Monte 
Carlo  techniques,  whether  there  exist  significant 
differences  between  the  small  sample  performance  of  these 
estimators.

For  reasons  of  cost  and  relevance,  this  study  has 
focused  on  the  limited  information  methods.  We  have 
considered  almost  all  the  limited  information  methods 
proposed in the literature. We have also suggested new
methods, such as a Generalized Limited Information Maximum
Likelihood Estimator, and different ways of interpreting or
deriving the existing ones. We have exploited the
instrumental variable approach to the Theil estimator to
generate modifications of Theil's estimator as well as of the
Brundy and Jorgenson estimators.

This  work  consists  of  two  major  parts.  The  first  part 
examines  the  small  sample  properties  of  the  limited 
information  estimators  designed  for  estimation  of 
autocorrelated models without lagged endogenous variables.
The second part studies the small sample properties of
estimators for dynamic simultaneous autoregressive models.

We  have  explored  a  number  of  subsidiary  issues  such  as 
the  efficiency  gain  of  employing  the  first  observation, 
utilization  of  different  reduced  forms,  alternative 
specification  of  the  exogenous  variables  and  different  ways 
of  estimating  the  autocorrelation  coefficient.  The  relative 
importance  of  the  first  observation  in  the  single  equation 
context was investigated recently by Maeshiro (1976, 1979),
Park and Mitchell (1980), and Beach and MacKinnon (1980). They
found instances in which the omission of the first
observation caused by the autoregressive transformation
resulted in a substantial loss of efficiency. We have examined
whether their results hold in the simultaneous equation
context.

The major finding of this study is that the Theil
Generalized Two Stage Least Squares estimator, which has
been completely ignored in the applied literature,
unambiguously dominates all other estimators in the static
model. In the case of the dynamic model, Theil's estimator,
Dhrymes' Convergent Two Stage Least Squares estimator and
Hatanaka's residual adjusted estimators dominated all other
estimators. Fair's estimator, which is the most commonly
used method, proved to be extremely inefficient.
I. INTRODUCTION


This  chapter  reviews  some  of  the  issues  and  problems 
involved  in  estimation  of  simultaneous  equation  models 
characterized  by  first  order  autocorrelation.  To  provide  a 
suitable framework for the subsequent discussion, single
equation models are reviewed first. The case of
simultaneous equation systems is then introduced. Finally, the
issues and problems addressed in this thesis are presented and the
plan of the work is outlined.

A.  The  Single  Equation  Model 

Problems  associated  with  the  presence  of 
autocorrelation  in  the  context  of  the  single  equation  model 
have been extensively studied.1 Many of the methods
appropriate for estimation of models with autocorrelated
errors are two step procedures, i.e., transformation of the
autocorrelated model to one free of autocorrelation and
estimation of the resultant equation using Ordinary Least
Squares (OLS). However, these methods differ in their choice
of  transformation  matrix  and  also  in  the  way  they  estimate 
the  autocorrelation  coefficient. 

Generally,  they  can  be  regarded  as  special  cases  of  the 
Maximum  Likelihood  (ML)  procedure  developed  by  Beach  and 
MacKinnon (1978). Consider the standard linear regression
model  with  a  first  order  autoregressive  error  structure 
defined  by 

1For a recent summary of much of this work see Judge,
Griffiths, Hill and Lee (1980).

$$Y = X\beta + U$$
$$U_t = r U_{t-1} + E_t, \qquad t = 1, \ldots, T \tag{1}$$
$$E_t \sim IN(0, \sigma^2)$$

where Y is a T×1 vector of observations on the dependent
variable, X is a T×K matrix of observations on the exogenous
variables, U and E are T×1 vectors of error terms, $\beta$ is
a K×1 vector of coefficients, and r is the coefficient of
autocorrelation.

The  concentrated  log-likelihood  function  corresponding 
to model (1) is

$$L = \text{const} + \tfrac{1}{2}\log(1-r^2) - \tfrac{T}{2}\log\Big[(1-r^2)(Y_1 - X_1'\beta)^2 + \sum_{t=2}^{T}(Y_t - X_t'\beta - rY_{t-1} + rX_{t-1}'\beta)^2\Big] \tag{2}$$

where $X_t'$ is the t-th row of the X matrix.

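The first order conditions with respect to $\beta$ and r
yield the maximum likelihood estimators of those
coefficients. The estimator of $\beta$ is

$$\hat\beta = (\dot X'\dot X)^{-1}\dot X'\dot Y$$

where

$$\dot X = PX; \qquad \dot Y = PY$$

and P is the T×T Prais-Winsten (1954) transformation matrix

$$P = \begin{pmatrix} \sqrt{1-r^2} & 0 & 0 & \cdots & 0 \\ -r & 1 & 0 & \cdots & 0 \\ 0 & -r & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -r & 1 \end{pmatrix}$$

and the estimator of the autocorrelation coefficient, r, is the solution of a
cubic equation with one real root in the interval (-1,1).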
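The role of the Prais-Winsten matrix can be illustrated numerically. The following is a minimal sketch, assuming a unit innovation variance and an illustrative value r = 0.6; the function names are hypothetical and are not part of the original study. It builds P and checks that it transforms the stationary AR(1) error covariance matrix into the identity, which is why OLS on the transformed data is appropriate.

```python
import numpy as np

def prais_winsten_matrix(T, r):
    """T x T Prais-Winsten matrix: first row sqrt(1 - r^2), then rows (-r, 1)."""
    P = np.eye(T)
    P[0, 0] = np.sqrt(1.0 - r**2)
    idx = np.arange(1, T)
    P[idx, idx - 1] = -r          # sub-diagonal entries
    return P

def ar1_covariance(T, r, sigma2=1.0):
    """Covariance of a stationary AR(1) error with innovation variance sigma2."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return sigma2 * r**lags / (1.0 - r**2)

# Illustrative values (assumptions, not taken from the thesis)
P = prais_winsten_matrix(5, 0.6)
V = ar1_covariance(5, 0.6)
whitened = P @ V @ P.T            # covariance of the transformed errors
```

Dropping the first row of `P` gives the Cochrane-Orcutt transformation, which whitens only the last T-1 transformed errors.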
Generalized Least Squares (GLS) estimation of model
(1) is equivalent to maximizing the log-likelihood
(2) while omitting its first term, $\tfrac{1}{2}\log(1-r^2)$ (the Jacobian of the
transformation), i.e., the term which ensures that the
stationarity condition holds. The GLS estimators of $\beta$ and r
are then given by

$$\hat\beta = (\dot X'\dot X)^{-1}\dot X'\dot Y$$
$$\hat r = \sum_{t=2}^{T}\hat U_t\hat U_{t-1}\Big/\sum_{t=3}^{T}\hat U_{t-1}^2$$

where $\hat U_t$ is the estimate of the error term $U_t$, and $\dot X = PX$, $\dot Y = PY$
are the data transformed by the Prais-Winsten matrix P.

It can be seen that the GLS procedure yields analytically
the same formula for estimation of the coefficient $\beta$ as the
ML method. The only difference is that the GLS procedure
estimates the coefficient of autocorrelation from the
Prais-Winsten formula, whereas in the ML method it is
obtained as the solution to a cubic equation.

The third alternative, which has until recently been
the most commonly used method, is the Cochrane-Orcutt (1949)
procedure. This method is equivalent to maximizing the
log-likelihood function (2) with respect to $\beta$ and r while
omitting the terms $\log(1-r^2)$ and $(1-r^2)(Y_1 - X_1'\beta)^2$.
Estimators of $\beta$ and r obtained from the Cochrane-Orcutt
procedure are given by
$$\hat\beta = (\tilde X'\tilde X)^{-1}\tilde X'\tilde Y$$
$$\hat r = \sum_{t=2}^{T}\hat U_t\hat U_{t-1}\Big/\sum_{t=2}^{T}\hat U_{t-1}^2$$

where $\tilde X = QX$, $\tilde Y = QY$,
and Q is the (T-1)×T Cochrane-Orcutt (CORC) transformation matrix,
which is the same as the matrix P without its first row. In
this procedure the coefficient of autocorrelation is
estimated by regressing the current residual on its one
period lag. The reason for the popularity of this method is
its simplicity, since it is nothing more than applying OLS to
the transformed equation

$$Y_t - rY_{t-1} = (X_t - rX_{t-1})'\beta + E_t, \qquad t = 2, \ldots, T$$

Note, however, that unlike the ML and GLS estimators, this
method uses only T-1 of the T observations. It should also
be noted that it is customary to iterate the GLS and CORC
procedures until a convergence criterion is met.
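The iterated Cochrane-Orcutt procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the convergence tolerance, and the simulated data (T = 200, r = 0.6, a white-noise regressor) are all hypothetical choices made for this example and are not the design used in this thesis.

```python
import numpy as np

def cochrane_orcutt(y, X, tol=1e-8, max_iter=100):
    """Iterated Cochrane-Orcutt: returns (beta, r) from T-1 transformed observations."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from OLS
    r = 0.0
    for _ in range(max_iter):
        u = y - X @ beta                           # residuals of the untransformed model
        r_new = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])
        y_star = y[1:] - r_new * y[:-1]            # the first observation is dropped
        X_star = X[1:] - r_new * X[:-1]
        beta = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
        converged = abs(r_new - r) < tol
        r = r_new
        if converged:
            break
    return beta, r

# Simulated data; every value below is an assumption made for illustration
rng = np.random.default_rng(0)
T, true_beta, true_r = 200, np.array([1.0, 2.0]), 0.6
X = np.column_stack([np.ones(T), rng.normal(size=T)])
e = rng.normal(size=T)
u = np.empty(T)
u[0] = e[0] / np.sqrt(1.0 - true_r**2)             # stationary starting value
for t in range(1, T):
    u[t] = true_r * u[t - 1] + e[t]
y = X @ true_beta + u
beta_hat, r_hat = cochrane_orcutt(y, X)
```

A Prais-Winsten variant would keep the first observation, weighted by the square root of (1 - r^2), instead of discarding it.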

All  three  procedures  are  asymptotically  equivalent  and 
hence  share  the  ML  properties  of  consistency,  efficiency  and 
normality.  That  is  to  say,  the  effect  of  the  first 
observation  diminishes  as  the  sample  size  increases.  But  the 
omission  of  the  first  observation  may  have  serious  effects 
on  the  small  sample  performance  of  the  estimators.  We  now 
briefly  review  the  literature  pertaining  to  this  issue. 

The  first  major  work  dealing  with  the  effects  of 
autocorrelation  on  the  small  sample  properties  of  the 
estimators  in  the  single  equation  context  was  undertaken  by 
Rao and Griliches (1969). They employed a standard linear
regression model as defined by (1). They conducted their
experiment with a sample of 20 observations using an
explanatory variable generated by a first order Markov
process, i.e., $X_t = \lambda X_{t-1} + V_t$, with $\lambda$ ranging from -0.8 to
0.99. They considered several two-step regression methods
designed for estimation of model (1) and compared them with
OLS. Their experiment suggested that2

there  is  a  significant  gain  in  efficiency  to  be  had 
from  using  two  stage  estimation  procedures  for 
moderate  and  high  levels  of  serial  correlation  in 
the  residuals  (r  >  0.3)  and  very  little  loss  from 
using  such  methods  even  when  the  true  r  is  small. 

They also reported some gain in efficiency for those
estimators that retain the first observation over the ones
which ignore it.

The results of the Rao and Griliches study were
criticised in a series of articles by Maeshiro (1976, 1979),
who showed that (1976, p. 500)

...contrary to its intended purpose, the
Cochrane-Orcutt transformation reduces rather than
increases the efficiency of estimators in many
cases... thus the various estimation methods proposed
by Cochrane and Orcutt (1949) and the method proposed
by Durbin (1960, pp. 150, 153) are dubious. In fact,
contrary  to  the  expectations  of  Cochrane  and  Orcutt 
and  Durbin,  ordinary  least  squares  estimators  may 
work  better  for  many  cases  of  trended  independent 
variables.  This  conclusion  also  leads  us  to  cast 
serious  doubt  about  the  validity  of  the  findings  of 
various  Monte  Carlo  studies  that  do  not  use  trended 
independent  variables  as  a  guide  in  the  choice  of 
estimators  when  an  independent  variable  contains  a 
trend.

Maeshiro (1979) further showed that the gain in efficiency of
those estimators that utilize the first observation is much
larger than what was suggested by Rao and Griliches, if the
independent variable contains a moderate trend.

Maeshiro's results were analytically confirmed by
Chipman (1979). He provided a formula for the lower
bound of the efficiency of CORC and OLS estimates when the
explanatory variable is trended. He showed that in the case
where the explanatory variable is a simple linear trend,
"the Cochrane-Orcutt procedure is only 71 percent as
efficient as ordinary least squares".3

2Rao and Griliches (1969), pp. 267-68.

A heuristic explanation of the observed pattern of the
poor performance of the CORC estimator vis-a-vis OLS and GLS is
given by Maeshiro (1979, pp. 259-60) as follows

The  reason  why  a  substantial  gain  in  efficiency  can 
be  expected  by  retaining  the  first  observation  in 
the  case  of  trended  independent  variables  but  not 
necessarily  in  other  cases  is  twofold.  First,  the 
autoregressive  transformation  (with  positive  r)  of  a 
trended  variable  tends  to  result  in  a  new 
independent  variable  that  takes  more  similar  values 
and  hence  possesses  less  variability  than  the 
original  independent  variable.  Second,  the  weighted 
first observation $\sqrt{1-r^2}\,X_1$, on the other hand,
tends  to  take  a  value  that  is  quite  different  from 
the  values  of  the  new  independent  variable  created 
by  the  autoregressive  transformation.  As  a  result, 
the  marginal  contribution  of  the  weighted 
observation  to  the  increase  in  the  variability  of 
the  new  independent  variable  and,  therefore,  to  the 
increase  in  the  efficiency  of  the  estimator,  can  be 
substantial.... For nontrended independent variables,
however,  the  autoregressive  transformation  of  a 
variable  does  not  necessarily  reduce  (in  fact,  can 
increase)  variability  and  the  weighted  first 
observation  is  not  necessarily  very  different  from 
the  values  of  the  autoregressive  transformed 
variable.
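The argument about trended regressors can be checked with exact variance formulas when r is treated as known. The sketch below is an illustration under assumptions made here (a model with an intercept and a linear trend, T = 20, r = 0.8, unit innovation variance); it is not the experimental design of any of the studies cited, and the function name is hypothetical. It computes the exact variance of the slope estimator for OLS, for CORC (which drops the first observation), and for GLS (which keeps the weighted first observation).

```python
import numpy as np

def slope_variances(T, r):
    """Exact slope variances of OLS, CORC and GLS for y_t = a + b*t + u_t, r known."""
    X = np.column_stack([np.ones(T), np.arange(1, T + 1, dtype=float)])
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    V = r**lags / (1.0 - r**2)                 # AR(1) covariance, unit innovation variance
    P = np.eye(T)                              # Prais-Winsten matrix
    P[0, 0] = np.sqrt(1.0 - r**2)
    P[np.arange(1, T), np.arange(T - 1)] = -r
    Q = P[1:]                                  # Cochrane-Orcutt matrix: P without row 1

    def linear_var(W):
        # Variance of the slope of the linear estimator (X'WX)^{-1} X'W y
        A = np.linalg.solve(X.T @ W @ X, X.T @ W)
        return (A @ V @ A.T)[1, 1]

    return {"OLS": linear_var(np.eye(T)),
            "CORC": linear_var(Q.T @ Q),
            "GLS": linear_var(P.T @ P)}

v = slope_variances(T=20, r=0.8)
efficiency_corc = v["GLS"] / v["CORC"]         # relative efficiency of CORC
efficiency_ols = v["GLS"] / v["OLS"]           # relative efficiency of OLS
```

Because GLS is the best linear unbiased estimator here, both ratios lie between 0 and 1; varying `T` and `r` shows how much is lost by discarding the first observation for a trended regressor.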

Park and Mitchell (1980) also confirmed the findings of
Maeshiro. They also examined the performance of different
estimators in hypothesis testing. They found
that (1980, p. 185),4

none of the feasible estimators performs well in
hypothesis testing; all seriously underestimate
standard errors, making estimated coefficients
appear to be much more significant than they
actually are.


3Chipman (1979), p. 117.

4Park and Mitchell (1980), p. 185.
Beach and MacKinnon (1978) also compared the small
sample performance of the Cochrane-Orcutt and full
information maximum likelihood (FIML) estimators. They found
that FIML, which incorporates all the observations, always
performed better than the CORC procedure and that this difference
was especially noticeable when the exogenous variable was
trended.

Taylor (1981), using analytical approximations, showed
that  performance  of  the  single  equation  estimators  is 
sensitive  to  the  way  the  explanatory  variable  is  specified. 
He  showed  that  the  value  of  the  first  observation  becomes 
extremely  important  if  the  explanatory  variable  is 
non-stochastic  and  trended  and  its  trend  coefficient  happens 
to  be  close  to  the  autocorrelation  coefficient  of  the  error 
term.5

Finally, Doran and Griffiths (1982) conducted experiments
using  the  seemingly  unrelated  regression  model  with  first 
order  autoregressive  disturbances.  They  were  particularly 
interested  in  the  relative  efficiency  of  using  the  first 
observation.  They  found  that  there  does  not  exist  a 
significant  difference  between  the  estimators  that  utilize  T 
and the ones that employ T-1 observations.


5We  shall  review  his  findings  in  more  detail  later. 
B. Simultaneous Equation Models

Economic models usually involve a set of interrelated
relationships aimed at explaining certain economic
variables. In general, they can be compactly written as

$$YB + X\Gamma = U \tag{3}$$

where Y is the T×G matrix of observations on the endogenous
variables, X is the T×K matrix of observations on the
predetermined variables, B is the G×G matrix of structural
coefficients associated with the endogenous variables, $\Gamma$ is
the K×G matrix of coefficients of the predetermined
variables, and U is the T×G matrix of disturbances.

For estimation of the system (3), the following
assumptions about the stochastic disturbances are usually
made:6

$$U_{ti} \sim N(0, \sigma_{ii}), \qquad i = 1, \ldots, G$$
$$E(U_{ti}U_{sj}) = 0, \qquad t, s = 1, \ldots, T; \; t \neq s$$
$$E(U_{ti}U_{tj}) = \sigma_{ij}, \qquad i, j = 1, \ldots, G$$

In matrix notation these assumptions become

$$U_t \sim N(0, \Omega) \tag{4}$$
$$E(U_t'U_s) = 0, \qquad t \neq s \tag{5}$$

where $U_t$ is the t-th row of the U matrix, and

$$\Omega = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1G} \\ \vdots & & \vdots \\ \sigma_{G1} & \cdots & \sigma_{GG} \end{pmatrix} \tag{6}$$


6In the subsequent development E is used to signify an error
term. In some instances, however, E also denotes expectation;
the distinction between these usages is clear from the context.
Given equations (4), (5), and (6) and the assumption of
identifiability, system (3) can be consistently estimated
using  full  or  limited  information  methods.  Methods 
appropriate  for  the  estimation  of  this  type  of  model  are  the 
ones  which  take  into  account  the  simultaneity  created  by  the 
presence  of  the  current  endogenous  variables  among  the 
explanatory  variables  of  each  structural  equation. 

However,  one  of  the  crucial  assumptions  made  about  the 
properties  of  the  stochastic  disturbances  is  not  usually 
satisfied.  As  in  the  single  equation  case,  time  series  data 
invariably  generate  disturbances  which  are  time  dependent. 
The  simplest  form  of  this  time  dependence  is  a  first  order 
vector  autoregressive  process  which  replaces  assumption  (5) 
by 

$$U_t = U_{t-1}R + E_t \tag{7}$$

where $E_t$ is a random vector of G components, i.e.,
$E_t = (e_{1t}, \ldots, e_{Gt})$, $t = 1, \ldots, T$; the $E_t$ for any two different
time periods being independently normally distributed with
mean zero and a constant variance-covariance matrix $\Theta$, and R is a
G×G matrix of autocorrelation coefficients.

Following the practice in Amemiya (1961), Fair (1972),
and Dhrymes (1974), we assume that R is a diagonal matrix,
i.e.

$$R = \mathrm{diag}(r_1, r_2, \ldots, r_G)$$

where

$$-1 < r_i < 1, \qquad i = 1, \ldots, G$$
It is important to note that the assumption of
diagonality of R does not imply independence of the error
terms across equations. One can easily show that

$$E(U_t'U_t) = \Omega = R'\Omega R + \Theta$$
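The step behind this identity can be sketched as follows, writing $\Omega$ for the covariance matrix of a row of U and $\Theta$ for the covariance matrix of $E_t$ (both symbols are assumptions adopted here for readability), and assuming that $U_t$ is stationary and that $E_t$ is uncorrelated with $U_{t-1}$.

```latex
\begin{align*}
E(U_t'U_t) &= E\big[(U_{t-1}R + E_t)'(U_{t-1}R + E_t)\big] \\
           &= R'E(U_{t-1}'U_{t-1})R + R'E(U_{t-1}'E_t) + E(E_t'U_{t-1})R + E(E_t'E_t) \\
           &= R'\Omega R + \Theta .
\end{align*}
% With R = diag(r_1, ..., r_G), element (i, j) of this identity reads
\[
\omega_{ij} = r_i r_j\,\omega_{ij} + \theta_{ij}
\quad\Longrightarrow\quad
\omega_{ij} = \frac{\theta_{ij}}{1 - r_i r_j} .
\]
```

Hence the disturbances of equations i and j remain correlated whenever $\theta_{ij} \neq 0$, even though R is diagonal.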

The  presence  of  autocorrelation  poses  additional 
problems  for  the  estimation  of  the  simultaneous  equation 
system  formulated  in  (3)  and  (7).  In  general,  the 
application  of  the  classical  linear  simultaneous  equation 
techniques  which  ignore  the  autocorrelation  results  in 
inefficient, and in some cases (i.e., when lagged endogenous
variables are present) inconsistent, estimates of the structural
coefficients.

Techniques  appropriate  for  estimation  of  systems 
characterized  by  autocorrelation  are  not  adequately 
addressed  in  the  textbooks.  Only  a  limited  number  of 
textbooks  have  tackled  this  problem  and  even  some  of  those 
have  treated  it  inadequately.7  However,  there  has  been  a 
series  of  articles  in  the  journal  literature  proposing 
different methods for estimation of models with
autocorrelation. A systematic review of these methods will
be presented in later chapters. Some of the general
problems and issues involved in estimation of
simultaneous equation models with autocorrelation are
discussed in the next section. Then, the structure of this
thesis and the issues that will be studied are presented.


7See Kmenta (1971), Pindyck and Rubinfeld (1981), Klein (1974),
Kelejian and Oates (1981) and Stewart and Wallis (1981).


C. Estimation of Simultaneous Equation System in the Presence of First Order Autocorrelation

The model described by (3) and (7), namely

$$YB + X\Gamma = U \tag{8}$$
$$U_t = U_{t-1}R + E_t \tag{9}$$

can be written as a restricted transformed structure by
eliminating (9) to obtain

$$YB + X\Gamma - Y_{-1}BR - X_{-1}\Gamma R = E \tag{10}$$
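The elimination step can be written out as a short derivation. In this sketch $\Gamma$ denotes the coefficient matrix of the predetermined variables and the subscript $-1$ denotes a one period lag; both are notational assumptions made here.

```latex
\begin{align*}
U      &= YB + X\Gamma                 && \text{structure (8)} \\
U_{-1} &= Y_{-1}B + X_{-1}\Gamma       && \text{(8) lagged one period} \\
U      &= U_{-1}R + E                  && \text{autoregressive process (9)} \\
\Rightarrow\quad
YB + X\Gamma &= (Y_{-1}B + X_{-1}\Gamma)R + E \\
\Rightarrow\quad
YB + X\Gamma - Y_{-1}BR - X_{-1}\Gamma R &= E .
\end{align*}
```

The coefficients on the lagged variables are products of R with B and $\Gamma$, which is the source of the restrictions imposed by the autoregressive transformation.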


The  system  represented  in  (8)  and  (9)  can  be  estimated  using 
system  or  limited  information  techniques.  Estimating  via 
limited  information  methods  ignores  three  sources  of 
information  about  all  equations  except  the  one  which  is 
being  estimated,  namely, 

1- Any across equation over-identifying restrictions.

2-  Across  equation  covariance  structures. 

3-  The  restrictions  caused  by  the  autoregressive 
transformation  in  (10). 

However,  on  the  credit  side,  the  limited  information 
methods  are  simpler  to  implement  and,  moreover,  they  prevent 
any specification error occurring in one equation from
affecting  the  estimation  of  the  other  equations. 

In  practice  limited  information  methods  are  most  often 
used  and  these  are  the  set  of  estimators  examined  in  this 
thesis.  Unfortunately,  there  is  no  unified  systematic 
treatment  of  the  methods  for  estimation  of  system  (8)  and 
(9) in the literature. A large number of alternative
estimators have been proposed. These estimators, whose variety is often
confusing, can be distinguished by whether they use T or T-1
observations, by the structure of the reduced form they
employ and by the way they estimate the autocorrelation
coefficient. We now consider these issues.

In  practice  the  coefficient  of  autocorrelation  is  never 
known.  Usually,  a  consistent  estimate  of  it  based  on  the 
estimated  residuals  from  a  first-stage  consistent  estimator 
is  employed.  In  the  single  equation  context,  there  are  a 
number  of  alternative  methods  proposed  for  estimation  of  the 
autocorrelation  coefficient.  We  shall  review  them  here  and 
expect  that  their  properties  carry  over  to  the  simultaneous 
system  as  well.  Basically,  these  alternative  estimators  are 
obtained by maximizing the log likelihood (2) with respect
to r conditional on a given value of the vector $\beta$. If we
ignore the Jacobian of the transformation in (2), the above
procedure amounts to minimizing the sum of squared residuals
given in the last term of (2) with respect to r for a given
value of $\beta$. This means that for those estimators which use
T-1 observations, r will be estimated as the value which
minimizes the sum of the T-1 squared residuals

$$SR_1 = \sum_{t=2}^{T}\hat E_t^2 = \sum_{t=2}^{T}(\hat U_t - r\hat U_{t-1})^2$$

since $\hat U_t = r\hat U_{t-1} + \hat E_t$.
To minimize $SR_1$ with respect to r, we set

$$\partial SR_1/\partial r = 0$$

which yields

$$\hat r_1 = \sum_{t=2}^{T}\hat U_t\hat U_{t-1}\Big/\sum_{t=2}^{T}\hat U_{t-1}^2$$

which is the familiar Cochrane-Orcutt formula.
The same process applies to the estimators which
utilize all T observations. The r which minimizes the sum of
the T squared residuals is obtained by partially
differentiating $SR_2$ (defined below) with respect to r and
setting the derivative equal to zero:

$$SR_2 = \sum_{t=1}^{T}\hat E_t^2 = (1-r^2)\hat U_1^2 + \sum_{t=2}^{T}(\hat U_t - r\hat U_{t-1})^2$$

$$\partial SR_2/\partial r = 0$$

yields

$$\hat r_2 = \sum_{t=2}^{T}\hat U_t\hat U_{t-1}\Big/\sum_{t=3}^{T}\hat U_{t-1}^2$$

which is the Prais-Winsten formula for estimating the
autocorrelation coefficient.
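The two estimators of r differ only in whether the squared first residual enters the denominator. The following minimal sketch makes this concrete; the function names are hypothetical and the random vector `u` is only a stand-in for estimated residuals. It computes both formulas and the all-observation criterion (the sum of T squared residuals, including the weighted first one) that the Prais-Winsten formula minimizes.

```python
import numpy as np

def r_corc(u):
    """Cochrane-Orcutt formula: denominator sums u_{t-1}^2 for t = 2, ..., T."""
    return (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])

def r_pw(u):
    """Prais-Winsten formula: denominator sums u_{t-1}^2 for t = 3, ..., T."""
    return (u[1:] @ u[:-1]) / (u[1:-1] @ u[1:-1])

def sr2(u, r):
    """Sum of T squared residuals, including the weighted first observation."""
    return (1.0 - r**2) * u[0]**2 + np.sum((u[1:] - r * u[:-1])**2)

# Stand-in residuals (an assumption for illustration only)
rng = np.random.default_rng(1)
u = rng.normal(size=25)
r1, r2 = r_corc(u), r_pw(u)
```

Since both formulas share the same numerator and the Prais-Winsten denominator omits the squared first residual, `r2` is never smaller than `r1` in absolute value.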

Obviously $\hat r_2$ tends to be greater in absolute value than $\hat r_1$. Theil (1971)
proposed an alternative estimator that incorporates a
degrees of freedom correction, which would result in a
smaller estimate of r than $\hat r_2$ and $\hat r_1$. However, empirical
research (see for instance the essay by Hildreth and Dent in
Sellekaerts (1974)) shows that $\hat r_2$ and $\hat r_1$ tend to have a
downward bias as they stand. In the light of this fact, use
of the Theil estimator does not seem appropriate. In the
present work, we will follow Park and Mitchell (1980) and use
the Prais-Winsten formula for estimators that use all T
observations and the CORC method for those which utilize T-1
observations. We emphasize again that these estimators are
derived from the single equation case and it is a matter for
empirical determination whether these differences are
important in the simultaneous equation case.
With regard to the choice of the reduced form
structure, there are two possibilities. The reduced form
system can be written as

$$Y = -X\Gamma B^{-1} + UB^{-1} \tag{11}$$

or in the form called "the augmented reduced form", that is

$$Y = -X\Gamma B^{-1} + Y_{-1}BRB^{-1} + X_{-1}\Gamma RB^{-1} + EB^{-1} \tag{12}$$

which is obtained by substituting for U in (3).

In  the  absence  of  lagged  endogenous  variables,  the 
reduced  form  (11)  can  be  utilized  to  provide  consistent 
estimates  of  the  endogenous  variables  appearing  on  the  right 
hand  side  of  any  structural  equation.  But  if  the  system 
contains  lagged  endogenous  variables,  the  reduced  form  (12) 
should  be  used.  However,  the  distinction  has  not  always  been 
clearly  made  in  the  literature. 
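The relation between the two reduced forms can be verified numerically. The sketch below is illustrative only: the two-equation system, the parameter values, and the names `B`, `Gamma` and `R` are assumptions chosen for this example. It generates data from the ordinary reduced form with first order autoregressive disturbances and checks that the same data satisfy the augmented reduced form from the second observation onward.

```python
import numpy as np

rng = np.random.default_rng(2)
T, G, K = 30, 2, 3
B = np.array([[1.0, 0.4], [-0.5, 1.0]])        # coefficients of endogenous variables
Gamma = rng.normal(size=(K, G))                # coefficients of predetermined variables
R = np.diag([0.7, -0.3])                       # diagonal autocorrelation matrix
X = rng.normal(size=(T, K))
E = rng.normal(size=(T, G))

# Disturbances following U_t = U_{t-1} R + E_t (rows are time periods)
U = np.zeros((T, G))
U[0] = E[0]                                    # arbitrary start; only t >= 2 is checked
for t in range(1, T):
    U[t] = U[t - 1] @ R + E[t]

B_inv = np.linalg.inv(B)
Y = (-X @ Gamma + U) @ B_inv                   # ordinary reduced form

# Augmented reduced form: current X, lagged Y, lagged X and the innovation E
rhs = (-X[1:] @ Gamma + Y[:-1] @ B @ R + X[:-1] @ Gamma @ R + E[1:]) @ B_inv
max_gap = np.max(np.abs(Y[1:] - rhs))          # zero up to rounding error
```

The augmented form has a serially independent error but requires the lagged variables as additional regressors, which costs one observation.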

D.  Monte  Carlo  Evidence 

Most of the estimators that have been proposed have
known large sample properties, but the small sample
properties of these estimators have not been considered. This
question can be addressed by theoretical analysis or by Monte
Carlo experiments. Even though there have been some attempts
to derive theoretically the small sample
distributions of different estimators,8 their derivation is
often time consuming, difficult and sometimes impossible.
The difficulties involved in the theoretical derivation are
the main reason for the attractiveness of Monte Carlo
methods.9 Monte Carlo experiments tend to have lower labor
cost and their comparative advantage increases with the
complexity of the problem.10

8Nagar (1959), Basmann (1961), Richardson (1968), Sawa (1969),
Sargan and Mikhail (1971), Kadane (1971), Phillips (1980), and
Mariano (1982) among others have derived the finite sample
distributions and the first few moments (where they exist)
for K-class estimators under the assumptions of
non-stochastic exogenous variables in the absence of
autocorrelation.

Results of works by Phillips (1977) and Maasoumi (1980)
among  others  suggest  that  the  discrepancies  between 
asymptotic  and  finite  sample  behavior  of  the  estimators  are 
parameter  dependent  and  we  cannot,  without  qualification, 
postulate  the  asymptotic  behavior  of  the  estimators  from  the 
small  sample  results.  However,  as  Hendry(1973,  1974)  has 
suggested  we  can  use  the  asymptotic  properties  as  a  guide  to 
the  small  sample  performance  of  different  estimators. 

After  an  extensive  review  of  the  literature,  it  was 
found  that  there  exist  only  a  few  studies  which  have  tried 
to  investigate  the  small  sample  properties  of  the 
simultaneous  equation  estimators  in  the  presence  of 
autocorrelation. Schink and Chiu (1966) examined the effects
of autocorrelation on the performance of the ordinary least
squares, limited information single equation (LISE) and two
stage least squares (2SLS) estimators. They concluded that
autocorrelation had no significant effect on the performance
of LISE and 2SLS, while it adversely affected the OLS
estimator. Hurd (1972) examined the performance of OLS, CORC,
2SLS and a modified version of 2SLS which, in addition to
simultaneity, also corrected for the autocorrelation.11 He
found that techniques which correct only for the autocorrelation
perform much better than those which correct for the
simultaneity. Goldfeld and Quandt (1972) examined the small
sample performance of OLS, 2SLS and various versions of
FIML. They found that FIML, when it takes into account both
autocorrelation and simultaneity, performs better than OLS,
2SLS and versions of FIML that ignore either
the simultaneity or the autocorrelation problem.

9See Summers (1965).

10For a survey of these studies see Johnston (1972) and Sowey
(1973).

Hendry and Harrison (1974) investigated the main sources
of small sample inconsistencies of OLS and 2SLS when they
are applied to a structural equation with autocorrelation.
Hendry and Srba (1977) investigated the performance of the
OLS and instrumental variable (IV) estimators, their
generalizations which allow for the presence of
autocorrelation (denoted as ALS and AIV), and ordinary 2SLS
estimators using a dynamic autoregressive model. They
concluded that asymptotic distribution theory can be useful
in assessing the small sample behavior of different
estimators. Finally, Wang and Fuller (1982), using a dynamic
autoregressive model, compared the performance of an
autoregressive two-stage least squares estimator with two
full information estimators suggested by Wang and
Fuller (1975) and independently derived by Hatanaka (1976).
All three estimators shared some features of the two-step
Gauss-Newton procedure and of Aitken generalized least
squares. They found that (1982, p. 140) "Monte Carlo results
for the autoregressive estimators are generally consistent
with the large sample properties of those estimators."

11It is worth noting that the modified 2SLS estimator
considered by Hurd is equivalent to the modified Theil
estimator analysed in chapters II and III of this thesis.

However,  the  scope  of  these  works  on  the  small  sample 
properties  of  the  simultaneous  equation  estimators  in  the 
presence  of  autocorrelation  is  very  limited.  Nothing  is 
known  about  the  small  sample  properties  of  most  of  the 
estimators such as Theil's Generalized Two Stage Least
Squares (G2SLS), Brundy and Jorgenson's instrumental variable
estimator,  Amemiya's  autoregressive  Limited  Information 
Maximum  Likelihood  (ALIML)  estimator,  Fair's  estimator, 
Hatanaka's  two  step  methods,  and  Dhrymes'  two  step 
procedures.  The  relative  lack  of  knowledge  in  this  important 
area  of  econometrics  is  the  primary  reason  for  undertaking 
this  thesis.  Our  objective  is  to  investigate  the  small 
sample  performance  of  all  major  simultaneous  equation 
estimators  designed  for  estimation  of  the  models 
characterized  by  autocorrelation. 

To  pursue  this  task,  we  shall  concentrate  on  the 
limited  information  methods.  This  choice  is  primarily  due  to 
cost  considerations  but  the  fact  that  these  methods  are  of 
more  practical  importance  than  the  full  information  ones  is 
also  a  major  justification  of  our  approach.  We  have  divided 
our  enquiry  into  two  major  parts.  The  first  part 
investigates  the  small  sample  properties  of  the  limited 
information methods of estimation in the absence of lagged
endogenous variables. The primary focus of this part is the
investigation of the effects of utilizing T or T-1
observations, employing different reduced forms, and using
different methods for the estimation of the autocorrelation
coefficient. This part also investigates the effects of
different specifications of the exogenous variables on the
small  sample  properties  of  the  estimators.  The  second  part 
of  this  work  studies  the  small  sample  properties  of  the 
limited  information  estimators  in  the  presence  of  lagged 
endogenous  variables. 


II. LIMITED INFORMATION METHODS OF ESTIMATION IN THE ABSENCE
OF LAGGED ENDOGENOUS VARIABLES

This chapter presents the limited information methods
of estimation of autocorrelated models without lagged
endogenous variables. These estimators can be categorized in
a variety of ways. They can be classified according to
whether they employ T or T-1 observations. The number of
observations utilized has a direct bearing on the type of
reduced form that they can employ and the way they estimate
the autocorrelation coefficient. They can also be distinguished
on the basis of the instruments or the reduced form that they
employ. For the purpose of exposition we classify them into
the following two categories: (i) those that are generated
by the Limited Information Maximum Likelihood (LIML)
approach and (ii) those that follow the Theil Generalized
Two Stage Least Squares (G2SLS) approach. The emphasis in
this chapter will be on the computational aspects of the
estimators. In the next chapter the asymptotic properties of
the estimators will be considered, paying particular
attention to the question of asymptotic efficiency.

A.  The  Variants  of  the  LIML  Approach 

Autoregressive Limited Information Maximum Likelihood
(ALIML)

Without loss of generality, we focus on the estimation
of the first equation of the system considered in Chapter I.
We can write this equation (after the normalization
$\beta_{11} = 1$) as

$$y_1 = Y_1 \beta_1 + X_1 \gamma_1 + u_1 \tag{1}$$

where

$$u_1 = r\,u_{1,-1} + e_1, \tag{2}$$

$y_1$, $Y_1$, $X_1$ and $u_1$ are sub-matrices of $Y$, $X$, and $U$,
and where $y_1$ is a $T \times 1$ vector of values of the dependent
variable, $Y_1$ is a $T \times G_1$ matrix of endogenous variables
other than the first one included in the first equation, $X_1$ is a
$T \times K_1$ matrix of exogenous variables included in the first
equation, $u_1$ and $e_1$ are $T \times 1$ vectors of disturbance
terms, $r$ is the (1,1) element in $R$, and $\beta_1$ and $\gamma_1$
are $G_1 \times 1$ and $K_1 \times 1$ vectors of coefficients
corresponding to the relevant elements of $B$ and $\Gamma$,
respectively.

Substituting (2) in (1) yields

$$y_1 - r\,y_{1,-1} = (Y_1 - r\,Y_{1,-1})\beta_1 + (X_1 - r\,X_{1,-1})\gamma_1 + e_1 \tag{3}$$

Using the notation $\bar Y_1 = Y_1 - r\,Y_{1,-1}$, etc., (3) becomes

$$\bar y_1 = \bar Y_1 \beta_1 + \bar X_1 \gamma_1 + e_1 \tag{4}$$

or

$$\bar y_1 = \bar Z_1 \delta_1 + e_1 \tag{5}$$

The problem of estimating (5) was first addressed by
Sargan (1961). He proposed a Limited Information Maximum
Likelihood method for the estimation of this equation. His
estimator was later modified by Amemiya (1966) to arrive at
some two step alternatives. The concentrated likelihood
function corresponding to the system (1) is given by


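$$L_1 = K_1 + \tfrac{1}{2}\log(B_1' W B_1) - \tfrac{1}{2}\log(e'e) \tag{6}$$

where

$$W = \bar Y_1^{*\prime}\,(I - \Phi(\Phi'\Phi)^{-1}\Phi')\,\bar Y_1^*$$

$$\Phi = (X, Y_{-1}, X_{-1})$$

$$\bar Y_1^* = (\bar y_1, \bar Y_1)$$

where $B_1$ is the first row of the $B$ matrix and bars denote the
transformed variables of (4).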

Using Sargan's estimator (ALIML), the estimates of the
coefficients of the system (1) can be obtained by partially
differentiating (6) with respect to $\beta_1$, $\gamma_1$, and $r$
and solving the resultant first order conditions. The solution for
$\beta_1$ and $\gamma_1$ is


(7) 


where $\lambda$ is the smallest root of the determinantal equation

$$|W_1 - \lambda W| = 0$$

and

$$W_1 = \bar Y_1^{*\prime}\,(I - \bar X_1(\bar X_1'\bar X_1)^{-1}\bar X_1')\,\bar Y_1^*$$

The autocorrelation coefficient is estimated as

$$\hat r = \frac{\sum \hat u_t\,\hat u_{t-1}}{\sum \hat u_{t-1}^2} \tag{8}$$

where

$$\hat u = y_1 - \hat y_1$$
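Formula (8) is straightforward to compute once the structural residuals are available. The following is a minimal sketch in Python with NumPy; the function name `estimate_r` is illustrative only, and it is assumed here that both sums in (8) run over t = 2, ..., T.

```python
import numpy as np

def estimate_r(resid):
    """First order autocorrelation coefficient in the spirit of (8).

    Assumption: both sums run over t = 2, ..., T, i.e.
    r_hat = sum(u_t * u_{t-1}) / sum(u_{t-1}^2).
    """
    u = np.asarray(resid, dtype=float)
    return float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))
```

For residuals that follow an exact first order scheme, such as `[1.0, 0.5, 0.25, 0.125]`, the function returns the generating coefficient.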


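A modified version of the above estimator called Sargan 2SLS
(S2SLS) was proposed by Amemiya (1966). S2SLS is obtained by
setting $\lambda = 1$ (the root of the determinantal equation
above), which gives

$$\begin{pmatrix}\hat\beta_1\\ \hat\gamma_1\end{pmatrix} =
\begin{pmatrix}
\bar Y_1'\Phi(\Phi'\Phi)^{-1}\Phi'\bar Y_1 & \bar Y_1'\bar X_1\\
\bar X_1'\bar Y_1 & \bar X_1'\bar X_1
\end{pmatrix}^{-1}
\begin{pmatrix}
\bar Y_1'\Phi(\Phi'\Phi)^{-1}\Phi'\bar y_1\\
\bar X_1'\bar y_1
\end{pmatrix}$$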


The  S2SLS  estimator  of  r  is  identical  to  that  of  ALIML  which 
is  given  in  (8).  S2SLS  is  related  to  the  ALIML  estimator  in 
the  same  way  that  2SLS  is  related  to  LIML.  The  S2SLS 
estimator  can  be  regarded  as  an  instrumental  variable 
estimator  which  uses  all  the  X's,  and  one  period  lags  of  the 
X's  and  the  Y's  as  instruments  for  the  right  hand  side 
endogenous variables.12 These instruments are all the
variables entering the augmented reduced form for $Y_1$ (Chapter
I, equation (12)).

The main drawback of Sargan's method is the large
number of instruments it uses. Even in a moderate size
model, there is a possibility that the number of instruments
is greater than the number of observations. Moreover, the
presence of all the X's and their lagged values in the list
of instruments can cause severe multicollinearity that in
turn will result in poor estimates.13 In the absence of
lagged endogenous variables a consistent estimate of $Y_1$ can
be obtained from the ordinary reduced form (Chapter I,
equation (11)). This can clearly reduce the number of
instruments in the initial stage and thus may decrease the
small sample bias of the estimator.14 However, its effect on
the small sample performance of the estimators must be
investigated.

12 See Sargan (1959, 1961).

13 Poor in the sense that the fitted values of $Y_1$, even though
consistent, will have relatively large variances. Whether
this affects the small sample properties of the estimator or
not is a question that will be empirically investigated
later.

To tackle the potential degrees of freedom problem,
especially for large models, Fair suggested a number of
alternative estimators which use fewer instruments than
S2SLS. In the following we shall discuss the motivation for,
and the steps required in the estimation of, his most clearly
enunciated procedure, which is also incorporated in the TSP
package.15 The asymptotic properties of this estimator will
be discussed in the next chapter.


Fair  Estimator 

Nagar (1959) showed that the small sample bias of the
K-class estimators is positively related to the number of
predetermined (instrumental) variables in excess of the
number of coefficients to be estimated. Using this theorem,
Fair (1970) showed that because S2SLS uses instruments in
excess of the ones which appear in the reduced form of the
estimated equation, the small sample bias of that estimator,
to the order $T^{-1}$, is increased.16


14 See Nagar (1959). We shall discuss this point shortly.

15 Note that the Fair estimator in TSP uses the cubic
formula developed by Beach and MacKinnon to calculate the
autocorrelation coefficient. However, as we shall
demonstrate later, this does not have a significant effect
on the estimated coefficients.

16 Fair showed that even when the augmented reduced form is
being used, the number of instruments employed in S2SLS can
be drastically reduced. He (1970, p. 511) demonstrated that
the explanatory variables that are excluded ($X_x$) from the
structural equation under consideration appear in the
augmented reduced form as $(X_x - r_i X_{x,-1})$. This means that
He proposed an alternative estimator17 which is
asymptotically less efficient than S2SLS but uses fewer
instruments and thus may have smaller small sample bias.

The structure of Fair's estimator is identical to S2SLS
except that it uses a subset of the instruments used by
S2SLS, i.e., $X_m = (X_1, X_{1,-1}, y_{1,-1}, Y_{1,-1})$, where
$X_1$, $y_1$, and $Y_1$ are defined above.

Given the autocorrelation coefficient r, we can write
the equation under consideration as

$$y_1 - r\,y_{1,-1} = (Y_1 - r\,Y_{1,-1})\beta_1 + (X_1 - r\,X_{1,-1})\gamma_1 + e_1$$

or

$$\bar y_1 = \bar Y_1\beta_1 + \bar X_1\gamma_1 + e_1 \tag{10}$$

The set of instruments used by S2SLS was $X_s = (X, X_{-1}, Y_{-1})$,
which are the variables appearing in the augmented
reduced form. Let


16 (cont'd) $X_x$ and its one period lag do not need to appear
as two independent instruments; rather their transformed
form appears as one instrument in the set of explanatory
variables describing the endogenous variables appearing on
the right hand side of the equation under consideration.
Using the transformed variables can greatly reduce the
number of instruments used for the first stage estimation in
S2SLS without any loss of efficiency (Fair, 1970, pp. 510,
511). However, this method, which uses the transformed
excluded exogenous variables as instruments, requires
knowledge of all the structural autocorrelation
coefficients. This in turn requires the estimation of a
complete system prior to the estimation of the structural
equation under consideration, which would not be practical,
especially in large models. Therefore, in this study we have
only considered Fair's most widely known method (outlined in
this section), which is also used in the TSP package with a
slight modification.

17  Note  that  Fair's  estimator  was  designed  in  the  first 
instance  to  deal  with  the  lagged  endogenous  case.  We  are 
including  it  here  because  Fair's  procedure  is  widely  used 
for  static  autoregressive  model  estimation. 
$$X_m = X_s(S_1, S_2, S_3) = (X_1, X_{1,-1}, y_{1,-1}, Y_{1,-1})$$

where $S_1$, $S_2$, and $S_3$ are appropriate selection matrices
which generate the minimum set of instruments needed for the
Fair estimator.

Fair replaces $\bar Y_1 = Y_1 - r\,Y_{1,-1}$ in (10) by
$\hat Y_1 - r\,Y_{1,-1}$, where $\hat Y_1$ is obtained from a
regression of $Y_1$ on $X_m$, i.e.,

$$\hat Y_1 = X_m(X_m'X_m)^{-1}X_m'Y_1$$

so that

$$\hat Y_1 - r\,Y_{1,-1} = X_m(X_m'X_m)^{-1}X_m'Y_1 - r\,Y_{1,-1} \tag{11}$$

The right hand side of (11) can be written, using the
notation $Y_{1,-1} = X_m S_{3,-1}$, as

$$\hat Y_1 - r\,Y_{1,-1} = X_m(X_m'X_m)^{-1}X_m'Y_1 - r\,X_m(X_m'X_m)^{-1}X_m'X_m S_{3,-1}$$

$$= X_m(X_m'X_m)^{-1}X_m'(Y_1 - r\,Y_{1,-1})$$

therefore

$$\hat Y_1 - r\,Y_{1,-1} = X_m(X_m'X_m)^{-1}X_m'\bar Y_1 = \hat{\bar Y}_{1m} \tag{12}$$

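Using (12) in place of $\bar Y_1 = Y_1 - r\,Y_{1,-1}$ in (10) and
applying OLS gives

$$\begin{pmatrix}\hat\beta_1\\ \hat\gamma_1\end{pmatrix} =
\begin{pmatrix}
\hat{\bar Y}_{1m}'\hat{\bar Y}_{1m} & \hat{\bar Y}_{1m}'\bar X_1\\
\bar X_1'\hat{\bar Y}_{1m} & \bar X_1'\bar X_1
\end{pmatrix}^{-1}
\begin{pmatrix}
\hat{\bar Y}_{1m}'\bar y_1\\
\bar X_1'\bar y_1
\end{pmatrix} \tag{13}$$

which can be further simplified as

$$\begin{pmatrix}\hat\beta_1\\ \hat\gamma_1\end{pmatrix} =
\begin{pmatrix}
\bar Y_1'X_m(X_m'X_m)^{-1}X_m'\bar Y_1 & \bar Y_1'\bar X_1\\
\bar X_1'\bar Y_1 & \bar X_1'\bar X_1
\end{pmatrix}^{-1}
\begin{pmatrix}
\bar Y_1'X_m(X_m'X_m)^{-1}X_m'\bar y_1\\
\bar X_1'\bar y_1
\end{pmatrix} \tag{14}$$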


which is exactly the same as S2SLS with $X_s$ replaced by $X_m$.
Equation (14) shows that the Fair and S2SLS estimators are
identical in structure. This is of course what one would expect,
since S2SLS regresses $\bar Y_1$ on $X_s$ whereas Fair regresses
$\bar Y_1$ on $X_m$, i.e., Fair chooses a sub-set of instruments
from $X_s$. Because it uses only a sub-set of the instruments, it
will also be inefficient relative to S2SLS.
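The two-stage structure shared by the Fair and S2SLS estimators can be sketched in a few lines. The code below is an illustrative sketch in Python with NumPy, not the TSP implementation: the function names are hypothetical, r is treated as known, the first observation is dropped (T-1 observations), and the instrument matrix is assumed to be aligned with observations 2, ..., T. In line with (12), the first stage projects the quasi-differenced endogenous regressors on the instruments.

```python
import numpy as np

def quasi_difference(a, r):
    """Return a_t - r * a_{t-1} for t = 2..T (first observation dropped)."""
    a = np.asarray(a, dtype=float)
    return a[1:] - r * a[:-1]

def fair_instruments(y1, Y1, X1):
    """X_m = (X_1, X_{1,-1}, y_{1,-1}, Y_{1,-1}), rows for t = 2..T."""
    return np.column_stack([X1[1:], X1[:-1], y1[:-1], Y1[:-1]])

def two_stage_ar1(y1, Y1, X1, instruments, r):
    """Two-stage estimate of (beta_1, gamma_1) in (10) for a given r.

    y1 is a length-T vector; Y1 and X1 are 2-D arrays with T rows.
    Passing a larger instrument set (X, X_{-1}, Y_{-1}) instead of X_m
    gives the S2SLS-type variant with the same code.
    """
    yb = quasi_difference(y1, r)
    Yb = quasi_difference(Y1, r)
    Xb = quasi_difference(X1, r)
    # First stage: fitted values of the transformed endogenous regressors.
    coef, *_ = np.linalg.lstsq(instruments, Yb, rcond=None)
    Yb_hat = instruments @ coef
    # Second stage: OLS of the transformed dependent variable on (Yb_hat, Xb).
    Z_hat = np.column_stack([Yb_hat, Xb])
    delta, *_ = np.linalg.lstsq(Z_hat, yb, rcond=None)
    g1 = Yb.shape[1]
    return delta[:g1], delta[g1:]
```

A call such as `two_stage_ar1(y1, Y1, X1, fair_instruments(y1, Y1, X1), r)` then corresponds to (14). Using `lstsq` rather than an explicit inverse keeps the projection well defined even if some instruments are collinear, for example when $X_1$ contains a constant.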

Fair  Brundy  and  Jorgenson  Estimator 

The  primary  purpose  of  the  Fair  estimator  is  to 
overcome  the  potential  degrees  of  freedom  problem  in  the 
estimation  of  the  reduced  form  by  using  a  sub-set  of  the 
variables  appearing  in  the  augmented  reduced  form  as 
instruments.  An  alternative  approach  to  solve  this  problem 
in  the  estimation  of  simultaneous  equation  systems  was 
proposed by Brundy and Jorgenson (1971) and was modified by
Fair (1972) to take into account the autoregressive
properties of the error terms. This estimator does not
require reduced form estimation at the initial stage. At the
first stage it takes into account the restrictions imposed
on the structural equations of the system other than the one
under consideration. Fair's version of the Brundy-Jorgenson
estimator is constructed as follows:

Write the augmented reduced form

$$Y = -X\Gamma B^{-1} + Y_{-1}BRB^{-1} + X_{-1}\Gamma RB^{-1} + EB^{-1} \tag{15}$$

as

$$Y = \Phi\,\Pi^* + V \tag{16}$$

where $\Phi = (X, Y_{-1}, X_{-1})$, $V = EB^{-1}$ and $\Pi^*$ is
partitioned according to $\Phi$. Brundy and Jorgenson showed that18

18 Actually the Brundy and Jorgenson proof was for the case
where there was no autocorrelation. By analogy, Fair has
any set of instrumental variables based on a consistent
estimate of $\Pi^*$ will result in asymptotically efficient
estimates of the structural parameters $\beta_1$ and $\gamma_1$.
$\Pi^*$ is a function of the $B$, $\Gamma$, and $R$ matrices.
Therefore, any consistent estimate of the latter can be used to
form $\hat\Pi^*$. In the absence of lagged endogenous variables,
application of 2SLS to each structural equation provides consistent
estimates of the structural as well as the autocorrelation
coefficients. These estimates can be used to form consistent
estimates of the $B$, $\Gamma$, and $R$ matrices to give
$\hat\Pi^*$, which in turn is used to generate
$\hat Y = \Phi\,\hat\Pi^*$. In the model

$$y_1 - r\,y_{1,-1} = (Y_1 - r\,Y_{1,-1})\beta_1 + (X_1 - r\,X_{1,-1})\gamma_1 + e_1 \tag{17}$$

define matrices $\bar y_1$, $\bar Z_1$ and $\delta$ as19

$$\bar Z_1 = [(Y_1 - \hat r\,Y_{1,-1}),\ (X_1 - \hat r\,X_{1,-1})]$$

$$\bar y_1 = y_1 - \hat r\,y_{1,-1}$$

$$\delta' = (\beta_1',\ \gamma_1')$$

and let $W_1$ be a set of instruments

$$W_1 = [(\hat Y_1 - \hat r\,Y_{1,-1}),\ (X_1 - \hat r\,X_{1,-1})]$$

where $\hat r$ is a consistent estimate of $r$ obtained in the first
stage and $\hat Y_1 = \Phi\,\hat\Pi_1^*$, where $\hat\Pi_1^*$ is the
relevant sub-matrix of $\hat\Pi^*$ formed using the consistent
estimates of $B$, $\Gamma$ and $R$ from the first stage. Fair's
version of the Brundy and Jorgenson estimator (FBJ) is


18 (cont'd) extended their proof to the case with
autocorrelation.

19 Note that so far we have been assuming that the value of
the autocorrelation coefficient is known. Only in this
section, following Fair, we assume it is unknown and use a
consistent estimate of it.
$$\hat\delta = (W_1'\bar Z_1)^{-1}W_1'\bar y_1 \tag{18}$$

This estimator is consistent and, if iterated, is fully
efficient within the class of limited information
estimators.20

It must be noted that the FBJ estimator can include
lagged endogenous variables. However, in the absence of such
variables, it can be modified at the initial stage by
generating $\hat Y$ from the ordinary reduced form

$$Y = X\Pi + V$$

where

$$\Pi = -\Gamma B^{-1}; \qquad V = UB^{-1}$$

Since autocorrelation does not destroy consistency, we can
obtain a consistent estimate of $Y$ using
$\hat\Pi = -\hat\Gamma\hat B^{-1}$. This simplification can
substantially reduce the computations at the initial stage. Its
effect on the small sample properties of the FBJ estimator will be
investigated later. It should be noted that this modified version
of the FBJ (MFBJ) estimator can use all T observations. In this
case MFBJ will take the form

$$\dot Z_1 = (PY_1,\ PX_1), \qquad \dot y_1 = Py_1, \qquad \dot W_1 = (P\hat Y_1,\ PX_1)$$

$$\hat\delta = (\dot W_1'\dot Z_1)^{-1}\dot W_1'\dot y_1 \tag{19}$$

$$\hat Y_1 = X\hat\Pi_1$$

where $\hat\Pi_1$ is a consistent estimate of $\Pi_1$ constructed on
the basis of consistent estimates of $B$ and $\Gamma$.

20 See Fair (1972), pp. 446-47.
B. The Theil Generalized Two Stage Least Squares Approach

A different approach to the estimation of the
structural equation with first order autocorrelation was
suggested by Theil (1958). In this section we shall discuss
the Theil estimator and will consider a number of its
variants. One of the important characteristics of the Theil
type estimators is that they employ the Prais-Winsten
transformation matrix for solving the autocorrelation
problem and therefore they utilize all the observations. The
asymptotic properties of the estimators using T or (T-1)
observations will not be different since the weight of the
first observation will vanish as T approaches infinity.21
However, the employment of the first observation might have
some effect on the small sample properties of these
estimators.

Theil  Estimator 

Theil (1958, p. 345) suggested a generalized 2SLS (G2SLS)
method for estimation of the system represented in (1). His
estimator can be summarized as follows:

1- First transform all structural equations using the
Prais-Winsten transformation matrix P


21 The importance of retaining the first observation is
particularly emphasized when the exogenous variables are
trended. The effects of trended data on the properties of
different estimators were studied by Hannan (1960, pp. 114-5,
122-28) and Grenander and Rosenblatt (1957, pp. 231-254).
However, they were mainly concerned with the asymptotic
properties.
$$PYB + PX\Gamma = PU = \dot U$$

or, with a dot denoting a variable transformed by P,

$$\dot Y B + \dot X \Gamma = \dot U \tag{20}$$

The reduced form of the system (20) can be written as

$$\dot Y = \dot X\,\Pi + \dot V \tag{21}$$

2- The next step is the transformation of the structural
equation under consideration in a similar fashion

$$\dot y_1 = \dot Y_1\beta_1 + \dot X_1\gamma_1 + e_1 = \dot Z_1\delta_1 + e_1 \tag{22}$$

3- Finally estimate equation (22) by OLS replacing $\dot Y_1$ by its
fitted values calculated from (21) as

$$\hat\delta_1 = (\hat{\dot Z}_1'\hat{\dot Z}_1)^{-1}\hat{\dot Z}_1'\dot y_1$$

where

$$\hat{\dot Z}_1 = (\hat{\dot Y}_1,\ \dot X_1)$$

$$\hat{\dot Y}_1 = \dot X\,\hat\Pi_1 = \dot X(\dot X'\dot X)^{-1}\dot X'\dot Y_1$$

To construct Theil's estimator we first need an estimate of
the coefficient of autocorrelation to construct the matrix
P. A consistent estimate of r can be obtained by applying
2SLS, ignoring the autocorrelation, to the structural
equation and calculating r using the Prais-Winsten formula
defined in Chapter I.
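The three steps of the Theil procedure can be sketched as follows for a given value of r. This is an illustrative sketch in Python with NumPy, not code from the thesis: the function names are hypothetical, and the matrix P is assumed to take the standard Prais-Winsten form (first diagonal element $\sqrt{1-r^2}$, ones elsewhere on the diagonal, $-r$ on the first sub-diagonal), since the Chapter I definition is not reproduced in this section.

```python
import numpy as np

def prais_winsten_matrix(T, r):
    """Assumed standard T x T Prais-Winsten transformation matrix."""
    P = np.eye(T)
    P[0, 0] = np.sqrt(1.0 - r * r)
    idx = np.arange(1, T)
    P[idx, idx - 1] = -r
    return P

def project(A, B):
    """Fitted values from an OLS regression of the columns of B on A."""
    coef, *_ = np.linalg.lstsq(A, B, rcond=None)
    return A @ coef

def g2sls(y1, Y1, X1, X, r):
    """G2SLS estimate of delta_1 = (beta_1', gamma_1')' for a given r.

    X holds all exogenous variables of the system, X1 those included
    in the first equation; Y1 and X1 are 2-D arrays with T rows.
    """
    P = prais_winsten_matrix(len(y1), r)
    yd, Yd, X1d, Xd = P @ y1, P @ Y1, P @ X1, P @ X   # steps 1 and 2
    Yd_hat = project(Xd, Yd)                          # fitted values from (21)
    Z_hat = np.column_stack([Yd_hat, X1d])
    delta, *_ = np.linalg.lstsq(Z_hat, yd, rcond=None)  # step 3
    return delta
```

In practice r would be replaced by a first-stage estimate, as described above; it is kept as an explicit argument here so that the transformation and the two least squares stages remain visible.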

Theil's  estimator  is  consistent  and  asymptotically 
efficient  within  the  class  of  limited  information  estimators 
if  and  only  if  the  non-zero  diagonal  elements  of  the  R 
matrix  are  all  equal,  a  result  recently  derived  in 
Buse (1983). 

As in the case of regular 2SLS the G2SLS estimator can
be interpreted as an instrumental variable estimator. The
instrumental variable version of G2SLS for estimation of the
system (1) is as follows:

1- Obtain a consistent estimate of the structural
coefficients and thus the autocorrelation parameter.

2- Transform the structural equation under consideration by
the Prais-Winsten transformation matrix

$$Py_1 = PY_1\beta_1 + PX_1\gamma_1 + Pu_1$$

or

$$\dot y_1 = \dot Y_1\beta_1 + \dot X_1\gamma_1 + e_1 = \dot Z_1\delta_1 + e_1$$

3- Let $W_1$ be a set of instruments such as

$$W_1 = (\hat{\dot Y}_1,\ \dot X_1)$$

where

$$\hat{\dot Y}_1 = \dot X(\dot X'\dot X)^{-1}\dot X'\dot Y_1$$

4- The instrumental variable analogue of G2SLS is

$$\hat\delta_1 = (W_1'\dot Z_1)^{-1}W_1'\dot y_1$$

As we shall see later, this estimator is numerically
equivalent to, and has the same asymptotic properties as,
Theil's G2SLS.
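The numerical equivalence can be checked directly. The following short derivation is a sketch under two stated assumptions: dots denote variables transformed by P, hats denote fitted values from the regression on the transformed exogenous variables, and the columns of the included exogenous variables are among those of the full exogenous matrix. The symbol $M$ for the projection matrix is introduced here for illustration only.

```latex
\begin{aligned}
M &= \dot X(\dot X'\dot X)^{-1}\dot X', \qquad M' = M, \qquad MM = M, \qquad M\dot X_1 = \dot X_1, \\
\hat{\dot Y}_1'\dot Y_1 &= \dot Y_1' M \dot Y_1 = \dot Y_1' M'M \dot Y_1 = \hat{\dot Y}_1'\hat{\dot Y}_1, \\
\dot X_1'\dot Y_1 &= (M\dot X_1)'\dot Y_1 = \dot X_1' M \dot Y_1 = \dot X_1'\hat{\dot Y}_1, \\
\text{so that}\quad W_1'\dot Z_1 &= \hat{\dot Z}_1'\hat{\dot Z}_1
\quad\text{with}\quad W_1 = \hat{\dot Z}_1 = (\hat{\dot Y}_1,\ \dot X_1), \\
(W_1'\dot Z_1)^{-1}W_1'\dot y_1 &= (\hat{\dot Z}_1'\hat{\dot Z}_1)^{-1}\hat{\dot Z}_1'\dot y_1 .
\end{aligned}
```

The last line is the G2SLS formula, so the two versions coincide whenever the same estimate of r is used to build P.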

Theil (II)  Estimator 

Theil's G2SLS first transforms the autocorrelated
system to one with no autocorrelation and then treats it as
the 2SLS problem.22 However, another estimator can be
constructed on the basis of the same principles but in the

22 For a general approach to the derivation of G2SLS and its
relation to the Madansky (1964) and Wickens (1969) estimators
see Fisher (1972). It should be noted that we have not
considered the latter estimators since they are shown (see
Fisher (1972), Wickens (1969)) to be asymptotically less
efficient than the Theil G2SLS estimator.
reverse order.23 This means that one first resolves the
simultaneity problem in the system and then tackles the
autocorrelation problem. This procedure also results in a
consistent estimator, which we shall call Theil(II). The
Theil(II) estimator can be defined as follows:

1- First consider the reduced form

$$Y = X\Pi + V \tag{23}$$

We can obtain a consistent estimate of the Y's by applying
OLS to (23).

2- Second, write the structural equation under consideration
as

$$y_1 = \hat Y_1\beta_1 + X_1\gamma_1 + u_1 + \hat V_1\beta_1 \tag{24}$$

where

$$\hat Y_1 = X(X'X)^{-1}X'Y_1$$

3- Finally, transform equation (24) using the matrix P and
estimate the resulting equation by OLS

$$\hat\delta_1 = (\hat Z_1'\hat Z_1)^{-1}\hat Z_1'Py_1$$

where

$$\hat Z_1 = (P\hat Y_1,\ PX_1)$$

As shown in the next chapter, the Theil(II) estimator
is consistent and asymptotically equal to the G2SLS
estimator.
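The reversal of the two steps is easy to see in code. The sketch below (Python with NumPy, hypothetical function names, r taken as given, and P again assumed to have the standard Prais-Winsten form) first forms the fitted values from the untransformed reduced form (23) and only then applies P.

```python
import numpy as np

def prais_winsten_matrix(T, r):
    """Assumed standard T x T Prais-Winsten transformation matrix."""
    P = np.eye(T)
    P[0, 0] = np.sqrt(1.0 - r * r)
    idx = np.arange(1, T)
    P[idx, idx - 1] = -r
    return P

def theil_ii(y1, Y1, X1, X, r):
    """Theil(II) estimate of delta_1 = (beta_1', gamma_1')' for a given r."""
    # Step 1: OLS on the untransformed reduced form (23).
    Pi_hat, *_ = np.linalg.lstsq(X, Y1, rcond=None)
    Y1_hat = X @ Pi_hat
    # Steps 2 and 3: replace Y1 by its fitted values, transform with P,
    # and estimate the transformed equation by OLS.
    P = prais_winsten_matrix(len(y1), r)
    Z_hat = np.column_stack([P @ Y1_hat, P @ X1])
    delta, *_ = np.linalg.lstsq(Z_hat, P @ y1, rcond=None)
    return delta
```

With r = 0 the matrix P is the identity and the function reduces to ordinary 2SLS, which gives a simple way to check an implementation.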

The instrumental variable version of Theil(II), denoted
as Theil(IV), can be constructed as follows:

1- Let $W_1$ be a set of instruments such as

23 This estimator and its instrumental variable analogue were
mainly worked out in Buse (1983).
$$W_1 = (\dot{\hat Y}_1,\ \dot X_1)$$

where

$$\hat Y_1 = X(X'X)^{-1}X'Y_1 \quad\text{and}\quad \dot{\hat Y}_1 = P\hat Y_1$$

2- Transform the structural equation (1) using a consistent
estimate of r.

$$Py_1 = PY_1\beta_1 + PX_1\gamma_1 + Pu_1$$

or

$$\dot y_1 = \dot Y_1\beta_1 + \dot X_1\gamma_1 + e_1 = \dot Z_1\delta_1 + e_1$$

3- Estimate the transformed system by instrumental variables
as

$$\hat\delta_1 = (W_1'\dot Z_1)^{-1}W_1'\dot y_1$$

We shall show later that, like Theil(II), Theil(IV) has the
same asymptotic properties as Theil's G2SLS estimator.

Modified Generalized Two Stage Least Squares Estimator

A modified Generalized Two Stage Least Squares (MG2SLS)
method for estimation of (1) can be constructed as follows:

1- At the first stage, use the ordinary reduced form
and regress each column of $Y_1$ on the set of X's.

2- Using OLS, estimate (1) using the fitted values of
$Y_1$ in place of $Y_1$. Use the estimated residuals to estimate r
and transform the structural equation such as

$$y_1 - \hat r\,y_{1,-1} = (Y_1 - \hat r\,Y_{1,-1})\beta_1 + (X_1 - \hat r\,X_{1,-1})\gamma_1 + e_1 \tag{25}$$

3- Estimate equation (25) iteratively by OLS replacing
$(Y_1 - \hat r\,Y_{1,-1})$ by its fitted value from a regression of
$(Y_1 - \hat r\,Y_{1,-1})$ on the exogenous variables transformed as
$(X - \hat r\,X_{-1})$.

This method, which was suggested by Pindyck and
Rubinfeld (1981), is just the Theil estimator which uses the
Q instead of the Prais-Winsten transformation matrix. It is
consistent and asymptotically equivalent to Theil's
G2SLS.

Generalized Limited Information Maximum Likelihood Estimator

As shown in Appendix I, Sargan and Amemiya have derived
(6) under restrictive assumptions which can affect the
asymptotic and small sample properties of their estimator.
It is also shown in Appendix I that under a different set of
assumptions one can derive a LIML estimator that has the
same asymptotic distribution as Theil's G2SLS. The intuitive
explanation of this estimator, which we shall call Generalized
Limited Information Maximum Likelihood (GLIML), can be given
in terms of the so-called "least variance ratio principle".
For a given value of the autocorrelation coefficient, the
structural equation

$$y_1 = Y_1\beta_1 + X_1\gamma_1 + u_1$$

can be transformed, using the Q (Cochrane-Orcutt) matrix, to
the equation free of autocorrelation; namely,

$$\bar y_1 = \bar Y_1\beta_1 + \bar X_1\gamma_1 + e_1 \tag{26}$$

where $\bar y_1$, $\bar Y_1$ and $\bar X_1$ are transformed variables.

Equation (26) can be written as

$$\bar Y_1^*\beta_1^* = \bar y_1^* = \bar X_1\gamma_1 + e_1 \tag{27}$$

where

$$\beta_1^{*\prime} = (1,\ -\beta_1') \quad\text{and}\quad \bar Y_1^* = (\bar y_1,\ \bar Y_1)$$

The "composite dependent variable" $\bar y_1^*$ is a linear
combination of the transformed endogenous variables included
in the first structural equation. For given $\beta_1$, the
composite variable $\bar y_1^*$ can be calculated and the
coefficient can be estimated as

$$\hat\gamma_1 = (\bar X_1'\bar X_1)^{-1}\bar X_1'\bar y_1^* \tag{28}$$

Therefore, the sum of squared residuals of the equation (27)
will be

$$\phi_1 = \hat V_1^{*\prime}\hat V_1^*$$

where

$$\hat V_1^* = \bar y_1^* - \hat{\bar y}_1^* = (I - \bar X_1(\bar X_1'\bar X_1)^{-1}\bar X_1')\,\bar y_1^*$$

therefore

$$\phi_1 = \bar y_1^{*\prime}\,(I - \bar X_1(\bar X_1'\bar X_1)^{-1}\bar X_1')\,\bar y_1^*$$

If the set of predetermined variables in the structural
equation (27) is extended to include all the predetermined
variables of the system, we shall have

$$\bar y_1^* = \bar X\gamma + e_1 \tag{29}$$

where $\bar X = (\bar X_1,\ \bar X_x)$, $\gamma' = (\gamma_1',\ 0)$
and $\bar X_x$ is the set of transformed predetermined variables
excluded from the first structural equation. If we ignore the
knowledge about the structure of the coefficients $\gamma$ and
apply OLS to equation (29) we will have

$$\hat\gamma = (\bar X'\bar X)^{-1}\bar X'\bar y_1^*$$

The sum of squared residuals corresponding to the regression
of the composite variable $\bar y_1^*$ on all the transformed
predetermined variables $\bar X$ will be equal to

$$\phi_2 = \bar y_1^{*\prime}\,(I - \bar X(\bar X'\bar X)^{-1}\bar X')\,\bar y_1^*$$

The estimate of the coefficient $\beta_1$ can be obtained by
finding the values of $\beta_1$ which minimize the effect of the
additional transformed predetermined variables $\bar X_x$ on the
sum of squared errors. In other words, the estimated $\beta_1$ are
the values which minimize the variance ratio

$$L = \phi_1 / \phi_2$$

L is always greater than or equal to one, since the additional
predetermined variables can only decrease the sum of squared
errors.

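L can be written as

$$L = (\beta_1^{*\prime}\,\pi_1\,\beta_1^*)\,/\,(\beta_1^{*\prime}\,\pi_2\,\beta_1^*) \tag{30}$$

where

$$\pi_1 = \bar Y_1^{*\prime}\,(I - \bar X_1(\bar X_1'\bar X_1)^{-1}\bar X_1')\,\bar Y_1^*$$

$$\pi_2 = \bar Y_1^{*\prime}\,(I - \bar X(\bar X'\bar X)^{-1}\bar X')\,\bar Y_1^*$$

where $\beta_1^*$ and $\bar Y_1^*$ are defined above.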
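Minimizing the ratio in (30) is a generalized eigenvalue problem: the minimum of L is the smallest root of $|\pi_1 - L\,\pi_2| = 0$, and the minimizing vector is the corresponding eigenvector. The sketch below (Python with NumPy) illustrates this for variables that have already been transformed for a given r; the function names are hypothetical and the routine is an illustration of the least variance ratio principle, not the program used in the thesis.

```python
import numpy as np

def residual_maker(A):
    """Return I - A (A'A)^{-1} A', computed via the pseudo-inverse."""
    return np.eye(A.shape[0]) - A @ np.linalg.pinv(A)

def least_variance_ratio(y1, Y1, X1, X):
    """Minimise (30) for already-transformed data.

    Returns (L_min, beta_1, gamma_1). X holds all transformed
    predetermined variables, X1 the included ones.
    """
    Ystar = np.column_stack([y1, Y1])
    pi1 = Ystar.T @ residual_maker(X1) @ Ystar
    pi2 = Ystar.T @ residual_maker(X) @ Ystar
    # pi1 b = L pi2 b  <=>  (pi2^{-1} pi1) b = L b
    vals, vecs = np.linalg.eig(np.linalg.solve(pi2, pi1))
    k = int(np.argmin(vals.real))
    b = vecs[:, k].real
    b = b / b[0]                      # normalise so that beta_1* = (1, -beta_1')'
    beta = -b[1:]
    # Equation (28): regress the composite variable on the included X's.
    gamma, *_ = np.linalg.lstsq(X1, Ystar @ b, rcond=None)
    return float(vals[k].real), beta, gamma
```

When the equation is just identified, the minimum ratio equals one and the resulting estimates coincide with simple instrumental variable estimates, which provides a convenient check.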

In Appendix I we show that minimization of L with
respect to $\beta_1$ and then estimation of $\gamma_1$ from (28)
will give the same numerical and analytical results as those
obtained by partial differentiation of the following concentrated
likelihood function with respect to $\beta_1$ and $\gamma_1$ and
solving the resultant first order conditions,

$$L_2 = K^* + \tfrac{1}{2}\log(\beta_1^{*\prime}\,\pi_2\,\beta_1^*) - \tfrac{1}{2}\log|\pi_2| - \tfrac{1}{2}\log(e'e) \tag{31}$$

Derivation of the concentrated likelihood function (31)
and its difference from the one derived by Sargan and
Amemiya is also given in Appendix I. Moreover, it is
shown in Chapter III that under the assumption of equal
autocorrelation coefficients across equations the ALIML and
GLIML estimators are asymptotically equivalent.

To carry out the GLIML procedure, we need to know the
value of the autocorrelation coefficient r. A consistent
estimate of r can be obtained by partial differentiation of
the concentrated likelihood function (31) with respect to r
and solving the resultant first order condition for r.
However, this procedure resulted in a very complicated
formula which was not of any practical value.24 As an
alternative, a search over r on the interval (-1,+1) was
made and estimates of $r$, $\beta_1$, and $\gamma_1$ which minimize
the sum of squared residuals were chosen. It is clear that the
GLIML estimator can be extended to employ all T observations. The
interpretation of this version in terms of the so-called
least variance ratio principle is the same as what was
presented for the case with T-1 observations. Derivation of
the concentrated likelihood function in this case is the
same as the one presented in Appendix I except that in this
case the Prais-Winsten transformation matrix will be
employed for transforming the variables. The log-likelihood

24 The second alternative was to calculate r from one of the
three procedures: CORC, Prais-Winsten or the cubic formula
developed by Beach and MacKinnon. Pilot experiments showed
that these alternatives often made the concentrated
log-likelihood function reach one of its local maxima
instead of the global one. This result can be explained by
the fact that none of the above three methods is a direct
solution to the first order condition; they are only
approximations.
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function  of  the  system  in  this  case  will  have  an  extra  term 
which  is  the  Jacobian  of  the  transformation.  This  Jacobian 
does  not  cancel  out  in  the  process  of  concentrating  the 
log-likelihood  function  to  incorporate  the  coefficients  of 
the  first  structural  equation.  The  concentrated 
log-likelihood  function  in  this  case  is 

$$L_2 = K^* + |J| + \tfrac{1}{2}\log(\beta_1^{*\prime}\pi_2\beta_1^*) - \tfrac{1}{2}\log|\pi_2| - \tfrac{1}{2}\log(e'e) \qquad (32)$$
which  is  the  extension  of  the  formula  given  in  (31)  except 
for  an  extra  term,  i.e.,  the  Jacobian  of  the  transformation, 
and the fact that in (32) variables were transformed by the
P matrix instead of the Cochrane-Orcutt transformation used
in (31).
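
The difference between the two transformations can be sketched for a single series as follows. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, and the Jacobian shown is that of the standard Prais-Winsten matrix for one series with a known coefficient $r$, not the full system Jacobian that enters (32).

```python
import math

def cochrane_orcutt(series, r):
    """Quasi-difference a series, dropping the first observation (T-1 rows)."""
    return [series[t] - r * series[t - 1] for t in range(1, len(series))]

def prais_winsten(series, r):
    """Quasi-difference a series but keep the first observation,
    rescaled by sqrt(1 - r^2), so that all T rows are retained."""
    first = math.sqrt(1.0 - r * r) * series[0]
    return [first] + cochrane_orcutt(series, r)

def log_jacobian(r):
    """log|P| for the single-series Prais-Winsten matrix: P is lower
    triangular with diagonal (sqrt(1 - r^2), 1, ..., 1)."""
    return 0.5 * math.log(1.0 - r * r)

# Hypothetical toy series and autocorrelation coefficient.
y = [1.0, 2.0, 4.0, 3.0]
r = 0.5
co = cochrane_orcutt(y, r)   # T-1 transformed observations
pw = prais_winsten(y, r)     # T transformed observations
```

The Cochrane-Orcutt transformation has no such term because the first observation is simply dropped, which is why the Jacobian appears only in the version that uses all T observations.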

Other Estimators

Other limited information estimators have been proposed
for estimation of autoregressive simultaneous equation
systems. Two of these methods that we shall review here are
those proposed by Klein(1974) and Kmenta(1971).

1.  The  Klein  Estimator 

An alternative approach for estimation of the system
presented in (1) and (2) was suggested by Klein(1974,
pp. 207-210). He argued against the employment of the
augmented reduced form on the grounds that the enlargement
of the vector of reduced form regressors taxes the degrees
of freedom and that the employment of $X$ and $X_{-1}$ as regressors raises
the possibility of strong multicollinearity. He suggested
using the ordinary reduced form, while taking into account
the information that the reduced form disturbances satisfy
an autoregressive scheme. His proposed method amounts to an
iterative scheme using the generalized least squares method
at the initial stage. Then, using the fitted values of the
jointly dependent variables from the first stage, the
transformed equation (4) is estimated iteratively. The main problem
with this approach is in the estimation of the
variance-covariance matrix of the reduced form at the first
stage. Write the ordinary reduced form (equation (11),
Chapter I) as

$$Y = X\Pi + V \qquad (33)$$

where

$$V = U B^{-1}$$

and

$$V_t = (u_{t1}, \ldots, u_{tG})\,B^{-1}$$

or

$$V_t = \Big(\sum_j b^{j1}u_{tj},\ \ldots,\ \sum_j b^{jG}u_{tj}\Big)$$

where $b^{ij}$ is the (i,j) element of the inverse of B.
Therefore

$$E(V_{ti}) = 0$$

$$E(V_{ti}^2) = \sigma^2\Big\{\sum_j \big[(b^{ji})^2/(1-r_j^2)\big]\Big\} \qquad (34)$$

so that equation (34) shows that the estimation of the
variance of the ith reduced form disturbance requires the
knowledge of the autocorrelation coefficients in all the
equations of the system. Only under the restrictive
assumption that the diagonal elements of the R matrix are all
equal can the variance-covariance matrix of the reduced form
disturbances be easily calculated.

Klein, however, overlooked this problem and proposed
the following procedure for estimation of the
variance-covariance matrix of the reduced form disturbances:

1- Regress $Y_t$ on the set of exogenous variables.

2- Use the residuals of the first step to compute the
covariance matrix as

$$\hat{\Omega}_i = \hat{v}_i\hat{v}_i'$$

A feasible GLS estimator requires $\hat{\Omega}_i^{-1}$, but it is clear
that the matrix $\hat{\Omega}_i$ is singular with rank one and therefore the
Klein method is not operational.


2.  The  Kmenta  Estimator 

Kmenta(1971, pp. 587-589) also proposed a method for
estimation of the model (1). He explicitly assumed the
absence of lagged endogenous variables in his model. His
estimator is as follows:

1- Using the augmented reduced form, regress each
column of $Y_1$ on the lagged endogenous variables as well as
current and lagged values of the exogenous variables using
OLS.

2- Use the fitted values $\hat{Y}_1$ and $\hat{Y}_{1,-1}$ in the
transformed equation (3), written as

$$y_1 - r\,y_{1,-1} = (\hat{Y}_1 - r\,\hat{Y}_{1,-1})\beta_1 + (X_1 - r\,X_{1,-1})\gamma_1 + e_1 \qquad (35)$$

and estimate the coefficients $r$, $\beta_1$, $\gamma_1$ by applying the
restricted least squares method to (35).

The list of instruments in Kmenta's method is the same
as Sargan's and thus is subject to the same criticisms;
especially if the X's are subject to some sort of trend, the
appearance of both the $X$'s and the $X_{-1}$'s among the instruments might
create severe multicollinearity. Moreover, obtaining the
fitted values of the endogenous variables from the augmented
reduced form automatically omits one observation, and
his estimator loses another observation due to the
employment of $\hat{Y}_{1,-1}$ in the second stage regression. In
empirical work, where researchers usually have a limited
number of observations, losing two observations is not
desirable.

Table 2-1 presents the general structure of the
estimators discussed in this chapter. Table 2-2 shows the
mathematical presentation of Table 2-1. The computational
summary of each of the estimators is given in Appendix II.


Table 2-1: Simultaneity and Autocorrelation (table body not recovered)


III.  ASYMPTOTIC  PROPERTIES  OF  THE  ESTIMATORS 


This chapter compares the asymptotic properties of the
different estimators discussed earlier. We will assume that
the autocorrelation coefficients are known, to avoid the
complications that can arise from substituting
consistent estimates for the unknown autocorrelation
coefficients. This is in fact a common practice.25 It is
known from general statistical principles that if the
autocorrelation coefficient is unknown, greater uncertainty
(variability) will be introduced for all the coefficients,
and we can conjecture that this effect will be roughly the
same across all estimators.

The  order  of  the  discussion  of  the  estimators  will  be 
the  same  as  in  Chapter  2.  We  shall  first  present  the 
asymptotic  properties  of  the  ALIML  estimator  and  then  use 
that  as  a  base  for  comparison. 

The  ALIML  Estimator 

Consider the following simultaneous equation system

$$Y B + X\Gamma = U \qquad (1)$$

$$U = U_{-1}R + E$$

$$E(E_t'E_t) = \Theta$$

Let the structural equation under consideration be

$$y_1 = Y_1\beta_1 + X_1\gamma_1 + u_1$$

or

$$y_1 = Z_1\delta_1 + u_1 \qquad (2)$$

$$u_1 = r_1 u_{1,-1} + \epsilon_1$$

$$E(\epsilon_1\epsilon_1') = \sigma_{11} I$$

$$E(u_1u_1') = \sigma_{11}\Sigma$$

where

$$Z_1 = (Y_1\ X_1) \quad \text{and} \quad \delta_1' = (\beta_1'\ \gamma_1')$$

In the subsequent discussion we will make the usual
assumptions that plim$(X'E/T) = 0$, and plim$(X'\Sigma^{-1}X/T)$ is a
K×K positive definite matrix.

25 See, for example, Wickens(1969), Fair(1970, 1972) and
Fisher(1972).

Denote 


$$Y_1^* = (y_1\ Y_1)$$


The  ALIML  estimator  is  equal  to 


where $\lambda$ is the smallest root of the determinantal equation

$$|\pi_1^* - \lambda\pi^*| = 0$$

and

$$\pi_1^* = Y_1^{*\prime}\big(I - X_1(X_1'X_1)^{-1}X_1'\big)Y_1^*$$

$$\pi^* = Y_1^{*\prime}\big(I - \Phi(\Phi'\Phi)^{-1}\Phi'\big)Y_1^*$$

and

$$\Phi = (X,\ Y_{-1},\ X_{-1})$$

Amemiya showed that ALIML is consistent and its asymptotic
covariance matrix is equal to
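$$\text{Asy-cov}\begin{pmatrix}\hat{\beta}_1 - \beta_1\\ \hat{\gamma}_1 - \gamma_1\end{pmatrix} = \sigma_{11}\,\text{plim}\begin{pmatrix}\dot{Y}_1'\Phi(\Phi'\Phi)^{-1}\Phi'\dot{Y}_1 & \dot{Y}_1'\dot{X}_1\\ \dot{X}_1'\dot{Y}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix}^{-1} \qquad (3)$$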


Fair's  Estimator 


Fair  has  not  derived  the  asymptotic  distribution  of  his 
estimator.26  We  do  not  do  so  either  but  we  do  provide  an 
alternative  proof  of  consistency  to  that  given  by  Fair. 

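To prove the consistency of Fair's estimator, for known
r, we can combine equations (10) and (14) of Chapter 2 to
obtain

$$\begin{pmatrix}\hat{\beta}_1\\ \hat{\gamma}_1\end{pmatrix} = \begin{pmatrix}Y_1'X_m(X_m'X_m)^{-1}X_m'Y_1 & Y_1'X_1\\ X_1'Y_1 & X_1'X_1\end{pmatrix}^{-1}\begin{pmatrix}Y_1'X_m(X_m'X_m)^{-1}X_m'\\ X_1'\end{pmatrix}y_1$$

Substituting for $y_1$ we will have

$$\begin{pmatrix}\hat{\beta}_1\\ \hat{\gamma}_1\end{pmatrix} = \begin{pmatrix}\beta_1\\ \gamma_1\end{pmatrix} + \begin{pmatrix}Y_1'X_m(X_m'X_m)^{-1}X_m'Y_1 & Y_1'X_1\\ X_1'Y_1 & X_1'X_1\end{pmatrix}^{-1}\begin{pmatrix}Y_1'X_m(X_m'X_m)^{-1}X_m'\epsilon_1\\ X_1'\epsilon_1\end{pmatrix}$$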


26 Actually Fair proposed a number of different alternative
estimators in his paper. In this thesis we take the Fair
estimator to mean the three step estimator outlined in
Chapter 2.


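It is easy to show that all the plims involving $Y_1'X_m$ exist.
Furthermore, we know by assumption that

$$\text{plim}(X_1'\epsilon_1/T) = 0$$

$$\text{plim}(X_m'\epsilon_1/T) = 0$$

Hence we have

$$\text{plim}\begin{pmatrix}\hat{\beta}_1\\ \hat{\gamma}_1\end{pmatrix} = \begin{pmatrix}\beta_1\\ \gamma_1\end{pmatrix}$$

which proves the consistency of the Fair estimator.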

We  showed  above  that  the  Fair  estimator  is  identical  to 
S2SLS  estimator  if  it  uses  all  the  instruments  appearing  in 
the  augmented  reduced  form  of  the  system  under 
consideration.  However,  it  uses  only  a  sub-set  of  those 
instruments  and  therefore  is  asymptotically  less  efficient 
than  the  ALIML  and  S2SLS  estimators. 

6-  The  Brundy  and  Jorgenson  Estimators 

The Fair version of Brundy and Jorgenson's estimator
utilizes the augmented reduced form where the $\Pi$'s are formed
using the consistent estimates of $B$, $\Gamma$, and $R$. Fair(1972)
showed that this estimator is consistent and asymptotically
efficient within the class of limited information
estimators.

To utilize all observations, we introduced a modified
version of the Brundy and Jorgenson estimator which uses the
ordinary reduced form. The proof of the consistency of this
estimator is the same as for the instrumental variable
version of the Theil(II) estimator considered below.
Following the same procedure we can obtain the asymptotic
distribution of the modified Brundy and Jorgenson estimator
as

$$\sqrt{T}(\hat{\delta}_1 - \delta_1) \to N[0,\ \sigma_{11}\,\text{plim}(W_3'\dot{Z}_1)^{-1}]$$

where

$$W_3 = [P\hat{Y}_1,\ PX_1]$$

$$\dot{Z}_1 = (\dot{Y}_1\ \dot{X}_1)$$

and $\hat{Y}_1$ is the fitted value of $Y_1$ calculated from the
consistent estimate of the ordinary reduced form.

Since $\hat{Y}_1$ is calculated from a consistent estimate of
the reduced form coefficients, the asymptotic variance
covariance matrix of the Brundy and Jorgenson estimator when
it uses the ordinary reduced form would be equal to that of Theil's
G2SLS, which is discussed later in this chapter.27

Comparison of the modified Brundy and Jorgenson
estimator when it uses T or T-1 observations will
demonstrate the small sample efficiency gain due to the
utilization of the first observation. On the other hand,
comparing it with Fair's Brundy and Jorgenson estimator,
which also uses T-1 observations, will show the small sample
differences due to employment of two different reduced
forms. Finally, comparison of the modified Brundy and
Jorgenson estimator when it uses all T observations with the
Fair version shows the small sample efficiency differences
of using the augmented reduced form while losing one
observation, against using the ordinary reduced form while
retaining the first observation.

27 As we shall see the Theil estimator is generally less
efficient than the ALIML estimator. Therefore, the BJ
estimator will, in general, be less efficient than the ALIML
estimator.
Theil's Estimator (G2SLS)

In  this  and  the  subsequent  sections  we  shall  deal  with 
the  asymptotic  properties  of  the  Theil  estimator  and  its 
variants.  We  will  go  into  a  detailed  discussion  of  the 
properties  of  these  estimators  simply  because  the  literature 
is  incomplete  on  this  topic. 

The reduced form of the system (1) can be written as

$$Y = X\Pi + V \qquad (4)$$

Transform (4) by the Prais-Winsten matrix P

$$PY = PX\Pi + PV$$

or

$$\dot{Y} = \dot{X}\Pi + \dot{V} \qquad (5)$$

Apply OLS to the system (5) to obtain the fitted values of
$\dot{Y}_1$ as

$$\hat{\dot{Y}}_1 = \dot{X}\hat{\Pi}_1 = \dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\dot{Y}_1$$

Then Theil's G2SLS estimator of $\delta_1$ is

$$\hat{\delta}_1 = \begin{pmatrix}\dot{Y}_1'\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\dot{Y}_1 & \dot{Y}_1'\dot{X}_1\\ \dot{X}_1'\dot{Y}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix}^{-1}\begin{pmatrix}\dot{Y}_1'\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\\ \dot{X}_1'\end{pmatrix}\dot{y}_1 \qquad (6)$$

The consistency of the Theil estimator, for both known and
unknown autocorrelation coefficient, is
demonstrated by Wickens. The asymptotic covariance matrix of
the G2SLS estimator when r is known is given by
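$$\text{Asy-cov}(\hat{\delta}_1) = \sigma_{11}\,\text{plim}\begin{pmatrix}\dot{Y}_1'\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\dot{Y}_1 & \dot{Y}_1'\dot{X}_1\\ \dot{X}_1'\dot{Y}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix}^{-1} \qquad (7)$$

where

$$\text{plim}\begin{pmatrix}\Pi_1'\dot{X}'\dot{X}\Pi_1/T & \Pi_1'\dot{X}'\dot{X}_1/T\\ \dot{X}_1'\dot{X}\Pi_1/T & \dot{X}_1'\dot{X}_1/T\end{pmatrix}^{-1} = \begin{pmatrix}\Pi_1'\Psi\Pi_1 & \Pi_1'\Psi_1\\ \Psi_1'\Pi_1 & \Psi_{11}\end{pmatrix}^{-1} = A \qquad (8)$$

where

$$\Psi = \text{plim}(\dot{X}'\dot{X}/T) \qquad (9)$$

$$\Psi_1 = \text{plim}(\dot{X}'\dot{X}_1/T) \qquad (10)$$

$$\Psi_{11} = \text{plim}(\dot{X}_1'\dot{X}_1/T) \qquad (11)$$

are assumed to exist.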

It is clear that the asymptotic covariance matrix of
the Theil estimator given by equation (8) is numerically
different from the asymptotic covariance matrix of the ALIML
estimator (equation (3)). This numerical difference between the ALIML
and Theil's G2SLS covariance matrices raises the question of
which estimator is the more efficient.

At this point we should note that G2SLS is constructed
using T observations while ALIML utilizes only T-1
observations. However, since the weight of the first
observation is asymptotically negligible, in the following
discussion we shall assume that all variables are
transformed by the Q (Cochrane-Orcutt) matrix.


The relative efficiency of ALIML and G2SLS has been
investigated in Buse(1983), which we follow here. First
introduce a selection matrix $S_x$ such that

$$\Phi S_x = X$$

where $S_x = (I, 0, 0)'$.

Therefore, we can write $\dot{X}$ as

$$\dot{X} = X - r_1X_{-1} = \Phi(S_x - r_1S_{x_{-1}}) \qquad (12)$$

Now using the reduced form $\dot{Y}_1 = \dot{X}\Pi_1 + \dot{V}_1$, the first
diagonal term in the asymptotic covariance matrix (3) will
become

$$\dot{Y}_1'\Phi(\Phi'\Phi)^{-1}\Phi'\dot{Y}_1 = (\dot{X}\Pi_1 + \dot{V}_1)'\Phi(\Phi'\Phi)^{-1}\Phi'(\dot{X}\Pi_1 + \dot{V}_1) \qquad (13)$$

Using the result in (12), which places $\dot{X}$ in the column space of $\Phi$, equation (13) will become

$$\dot{Y}_1'\Phi(\Phi'\Phi)^{-1}\Phi'\dot{Y}_1 = \Pi_1'\dot{X}'\dot{X}\Pi_1 + \Pi_1'\dot{X}'\dot{V}_1 + \dot{V}_1'\dot{X}\Pi_1 + \dot{V}_1'\Phi(\Phi'\Phi)^{-1}\Phi'\dot{V}_1 \qquad (14)$$

Taking the probability limit of each term in (14), we shall have

$$\text{plim}(\Pi_1'\dot{X}'\dot{X}\Pi_1/T) = \Pi_1'\Psi\Pi_1 \qquad (15)$$

$$\text{plim}(\Pi_1'\dot{X}'\dot{V}_1/T) = 0 \qquad (16)$$

$$\text{plim}(\dot{V}_1'\Phi(\Phi'\Phi)^{-1}\Phi'\dot{V}_1/T) = \mathbf{T}'(\Phi'\Phi)^{-1}\mathbf{T} \qquad (17)$$

where

$$\mathbf{T} = \text{plim}(\Phi'\dot{V}_1/T) \neq 0$$

since

$$\Phi'\dot{V}_1 = (X - r_1X_{-1},\ Y_{-1} - r_1Y_{-2},\ X_{-1} - r_1X_{-2})'\,(U - r_1U_{-1})(B^{-1})_1$$

and therefore $Y_{-1}$ and $Y_{-2}$ will be correlated with $U$ and $U_{-1}$
and thus $\mathbf{T}$ will be non-zero. Using the results in (15), (16)
and (17) and taking the probability limit of the
off-diagonal terms of the equation (3), the asymptotic
covariance of ALIML will become

$$\text{Asy-Cov(ALIML)} = \psi = \sigma_{11}\begin{pmatrix}\Pi_1'\Psi\Pi_1 + \mathbf{T}'(\Phi'\Phi)^{-1}\mathbf{T} & \Pi_1'\Psi_1\\ \Psi_1'\Pi_1 & \Psi_{11}\end{pmatrix}^{-1} \qquad (18)$$

where $\Psi$, $\Psi_1$ and $\Psi_{11}$ are defined before.

ALIML is efficient relative to G2SLS if $A - \psi$ is a
positive semi-definite matrix. This will be so if and only
if $\psi^{-1} - A^{-1}$ is positive semi-definite. Comparing (18) with
the asymptotic covariance matrix of G2SLS, A, defined in (8)
we shall have

$$\psi^{-1} - A^{-1} = \begin{pmatrix}\mathbf{T}'(\Phi'\Phi)^{-1}\mathbf{T} & 0\\ 0 & 0\end{pmatrix}$$

which is a positive semi-definite matrix. Therefore ALIML is
at least as efficient as Theil's G2SLS estimator. Only under
the condition that all the autocorrelation coefficients are
equal shall we have

$$U - r_1U_{-1} = E$$

and therefore, by the assumption that E is uncorrelated with
$Y_{-1}$ and $Y_{-2}$, plim$(\Phi'\dot{V}_1/T)$ will be equal to zero,
which means that the G2SLS and ALIML estimators are equally
efficient.28


28 Wickens(1969) has argued that the Theil estimator with
known or estimated autocorrelation coefficient is efficient
within the class of limited information. However, the above
proof shows the fallacy of his argument.
An Instrumental Variable Interpretation of the G2SLS
Estimator

As was mentioned above, Theil's G2SLS estimator can be
interpreted as an instrumental variable estimator.
Demonstration of this property, even though straightforward,
can shed some light on the general approach of the Theil
type estimators and therefore can be useful for
interpretations.

Consider estimation of the first structural equation.
Let $W_1$ be a set of instruments such as

$$W_1 = (\hat{\dot{Y}}_1\ \dot{X}_1)$$

where

$$\hat{\dot{Y}}_1 = \dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\dot{Y}_1$$

and

$$\dot{Y}_1 = PY_1 = PX\Pi_1 + PV_1 = \dot{X}\Pi_1 + \dot{V}_1$$

Note that

$$\dot{Y}_1 = \hat{\dot{Y}}_1 + \hat{\dot{V}}_1$$

and from the properties of OLS we have the following
orthogonality conditions

$$\hat{\dot{Y}}_1'\hat{\dot{V}}_1 = 0$$

$$\dot{X}_1'\hat{\dot{V}}_1 = 0$$

The structural equation under consideration can be written
as

$$Py_1 = PY_1\beta_1 + PX_1\gamma_1 + Pu_1$$

or

$$\dot{y}_1 = \dot{Z}_1\delta_1 + \dot{u}_1$$

Therefore, the instrumental variable version of G2SLS will
be

$$\hat{\delta}_1 = (W_1'\dot{Z}_1)^{-1}W_1'\dot{y}_1 \qquad (19)$$

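To prove the equivalence of this estimator and the G2SLS
estimator we first note that $W_1'\dot{Z}_1$ can be written as

$$W_1'\dot{Z}_1 = \begin{pmatrix}\hat{\dot{Y}}_1'\\ \dot{X}_1'\end{pmatrix}(\dot{Y}_1\ \dot{X}_1) = \begin{pmatrix}\hat{\dot{Y}}_1'(\hat{\dot{Y}}_1 + \hat{\dot{V}}_1) & \hat{\dot{Y}}_1'\dot{X}_1\\ \dot{X}_1'(\hat{\dot{Y}}_1 + \hat{\dot{V}}_1) & \dot{X}_1'\dot{X}_1\end{pmatrix} = \begin{pmatrix}\hat{\dot{Y}}_1'\hat{\dot{Y}}_1 & \hat{\dot{Y}}_1'\dot{X}_1\\ \dot{X}_1'\hat{\dot{Y}}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix} \qquad (20)$$

since $\hat{\dot{Y}}_1'\hat{\dot{V}}_1$ and $\dot{X}_1'\hat{\dot{V}}_1$ are equal to zero given the
orthogonality properties of OLS. The term $W_1'\dot{y}_1$ can also be
expanded as

$$W_1'\dot{y}_1 = \begin{pmatrix}\hat{\dot{Y}}_1'\dot{y}_1\\ \dot{X}_1'\dot{y}_1\end{pmatrix} \qquad (21)$$

Using (20) and (21) reduces (19) to the original Theil
estimator

$$\hat{\delta}_1 = (W_1'\dot{Z}_1)^{-1}W_1'\dot{y}_1 = \begin{pmatrix}\hat{\dot{Y}}_1'\hat{\dot{Y}}_1 & \hat{\dot{Y}}_1'\dot{X}_1\\ \dot{X}_1'\hat{\dot{Y}}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix}^{-1}\begin{pmatrix}\hat{\dot{Y}}_1'\dot{y}_1\\ \dot{X}_1'\dot{y}_1\end{pmatrix}$$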
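
The equivalence argued above rests only on the OLS orthogonality between fitted values and residuals. A minimal numerical sketch, assuming one instrument, one endogenous regressor, no included exogenous variables, and hypothetical toy data already transformed by P, illustrates the point; the helper names are not from the thesis.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def ols_fitted(x, y):
    """Fitted values from regressing y on a single regressor x (no intercept)."""
    slope = dot(x, y) / dot(x, x)
    return [slope * xi for xi in x]

# Hypothetical toy data: x is the single instrument, Y1 the endogenous
# regressor, y1 the dependent variable (all assumed already transformed).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
Y1 = [1.2, 1.9, 3.4, 3.8, 5.3]
y1 = [2.1, 4.2, 6.1, 8.3, 9.9]

Y1_hat = ols_fitted(x, Y1)                     # first-stage fitted values
V1_hat = [a - b for a, b in zip(Y1, Y1_hat)]   # first-stage residuals

orthogonality = dot(Y1_hat, V1_hat)                  # zero up to rounding
delta_iv = dot(Y1_hat, y1) / dot(Y1_hat, Y1)         # (W'Z)^{-1} W'y form
delta_2sls = dot(Y1_hat, y1) / dot(Y1_hat, Y1_hat)   # two-stage form
```

Because the fitted values are orthogonal to the residuals, the cross product of the fitted values with the original regressor equals their own sum of squares, so the two expressions coincide in any sample, not just asymptotically.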


An Alternative Instrumental Variable G2SLS
Estimator (Theil(IV))

The instrumental variable interpretation of Theil's
estimator uses the estimated transformed endogenous
variables as instruments. However, we can introduce another
instrumental variable estimator which uses the transformed
estimated endogenous variables as instruments. It turns out
that this estimator is asymptotically equivalent to Theil's
G2SLS estimator. We will call this estimator Theil(IV).

Consider again the first structural equation under
consideration

$$y_1 = Y_1\beta_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \qquad (22)$$

where $Z_1$ and $\delta_1$ are defined as before.

Transform equation (22) using the Prais-Winsten
transformation matrix to obtain

$$\dot{y}_1 = \dot{Z}_1\delta_1 + \dot{u}_1 \qquad (23)$$

To estimate equation (23), denote $W_2$ as a set of instruments
defined by

$$W_2 = (\dot{\hat{Y}}_1\ \dot{X}_1)$$

where

$$\hat{Y}_1 = X(X'X)^{-1}X'Y_1 \qquad (24)$$

$$\hat{V}_1 = Y_1 - \hat{Y}_1 = (I - X(X'X)^{-1}X')V_1$$

$$\dot{\hat{Y}}_1 = P\hat{Y}_1 = PX(X'X)^{-1}X'Y_1 = PX\Pi_1 + PX(X'X)^{-1}X'V_1 \qquad (25)$$

$$PY_1 = P\hat{Y}_1 + P\hat{V}_1 = \dot{\hat{Y}}_1 + \dot{\hat{V}}_1 \qquad (26)$$

The instrumental variable estimator of equation (23) will be
equal to

$$\hat{\delta}_{IV} = (W_2'\dot{Z}_1)^{-1}W_2'\dot{y}_1 \qquad (27)$$

which can also be written as

$$\hat{\delta}_{IV} = (W_2'\dot{Z}_1)^{-1}W_2'(\dot{Z}_1\delta_1 + \dot{u}_1) = \delta_1 + (W_2'\dot{Z}_1)^{-1}W_2'\dot{u}_1 \qquad (28)$$

Therefore, we have

$$\text{plim}(\hat{\delta}_{IV}) = \delta_1 + \text{plim}(W_2'\dot{Z}_1/T)^{-1}\,\text{plim}(W_2'\dot{u}_1/T) \qquad (29)$$

For consistency we need to show that the second term on the
right hand side of (29) is equal to zero. To do this, we
first expand $W_2'\dot{Z}_1$ to yield

$$\text{plim}(W_2'\dot{Z}_1/T) = \text{plim}\begin{pmatrix}\hat{Y}_1'P'PY_1/T & \hat{Y}_1'P'PX_1/T\\ X_1'P'PY_1/T & X_1'P'PX_1/T\end{pmatrix} \qquad (30)$$

Replacing $Y_1$ by $\hat{Y}_1 + \hat{V}_1$ we will get

$$\text{plim}(W_2'\dot{Z}_1/T) = \text{plim}\begin{pmatrix}\hat{Y}_1'P'P\hat{Y}_1/T + \hat{Y}_1'P'P\hat{V}_1/T & \hat{Y}_1'P'PX_1/T\\ X_1'P'P\hat{Y}_1/T + X_1'P'P\hat{V}_1/T & X_1'P'PX_1/T\end{pmatrix} = A^{-1} \qquad (31)$$

since following Wickens(1969) we have

$$\text{plim}(\hat{Y}_1'P'P\hat{V}_1/T) = \text{plim}(X_1'P'P\hat{V}_1/T) = 0$$

and A is the covariance matrix of the original Theil G2SLS
estimator.

Now consider plim$(W_2'\dot{u}_1/T)$. We can write this term as

$$\text{plim}(W_2'\dot{u}_1/T) = \text{plim}\begin{pmatrix}\dot{\hat{Y}}_1'\dot{u}_1/T\\ \dot{X}_1'\dot{u}_1/T\end{pmatrix} = \begin{pmatrix}\text{plim}\,\hat{\Pi}_1'\dot{X}'\dot{u}_1/T\\ \text{plim}\,\dot{X}_1'\dot{u}_1/T\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix} \qquad (32)$$

Substituting (31) and (32) in (29) proves the consistency of
this estimator.
To investigate the asymptotic distribution of this
estimator, using equation (28), we have

$$\sqrt{T}(\hat{\delta}_{IV} - \delta_1) = (W_2'\dot{Z}_1/T)^{-1}\,W_2'\dot{u}_1/\sqrt{T} \qquad (33)$$

Therefore, the asymptotic distribution of $\sqrt{T}(\hat{\delta}_{IV} - \delta_1)$ will
be equal to plim$(W_2'\dot{Z}_1/T)^{-1}$ times the asymptotic
distribution of $(W_2'\dot{u}_1/\sqrt{T})$. Following Theil(1971, pp. 380-381)
we can show that

$$(W_2'\dot{u}_1/\sqrt{T}) \to N(0,\ \sigma_{11}\,\text{plim}(W_2'W_2/T))$$

The probability limit of $(W_2'W_2/T)$ can be written as

$$\text{plim}(W_2'W_2/T) = \text{plim}\begin{pmatrix}\hat{Y}_1'P'P\hat{Y}_1/T & \hat{Y}_1'P'PX_1/T\\ X_1'P'P\hat{Y}_1/T & X_1'P'PX_1/T\end{pmatrix} = \text{plim}\begin{pmatrix}\Pi_1'\dot{X}'\dot{X}\Pi_1/T & \Pi_1'\dot{X}'\dot{X}_1/T\\ \dot{X}_1'\dot{X}\Pi_1/T & \dot{X}_1'\dot{X}_1/T\end{pmatrix} = \begin{pmatrix}\Pi_1'\Psi\Pi_1 & \Pi_1'\Psi_1\\ \Psi_1'\Pi_1 & \Psi_{11}\end{pmatrix} = A^{-1} \qquad (34)$$

where $\Psi$, $\Psi_1$, and $\Psi_{11}$ are defined in equations (9),
(10) and (11). Hence

$$\sqrt{T}(\hat{\delta}_{IV} - \delta_1) \to N(0,\ \sigma_{11}AA^{-1}A) = N(0,\ \sigma_{11}A) \qquad (35)$$

where A is consistently estimated by $(W_2'\dot{Z}_1)^{-1}$.

Theil(II) Estimator

The difference between Theil's original estimator and
its second version is the order in which the two problems,
simultaneity and autocorrelation, are tackled. We have
presented the second version in terms of instrumental
variables but it is possible to obtain an asymptotically
equivalent estimator by following the usual 2SLS approach and
replacing the right hand endogenous variables by their
estimated values, transforming the equation by P and then
applying OLS. We will call this estimator Theil(II).

First write the reduced form (3) as

$$\hat{Y}_1 = X\hat{\Pi}_1$$

where

$$\hat{\Pi}_1 = (X'X)^{-1}X'Y_1 \qquad (36)$$

Therefore, the transformed fitted values of $Y_1$ will be

$$\dot{\hat{Y}}_1 = P\hat{Y}_1 = PX(X'X)^{-1}X'Y_1 \qquad (37)$$

Thus the Theil(II) estimator is

$$\hat{\delta}_1 = \begin{pmatrix}\dot{\hat{Y}}_1'\dot{\hat{Y}}_1 & \dot{\hat{Y}}_1'\dot{X}_1\\ \dot{X}_1'\dot{\hat{Y}}_1 & \dot{X}_1'\dot{X}_1\end{pmatrix}^{-1}\begin{pmatrix}\dot{\hat{Y}}_1'\\ \dot{X}_1'\end{pmatrix}\dot{y}_1 \qquad (38)$$

Consistency of this estimator can be proved as follows.
Since $Y_1 = \hat{Y}_1 + \hat{V}_1$, where

$$\hat{Y}_1 = X(X'X)^{-1}X'Y_1$$

the structural equation to be estimated can be written as

$$Py_1 = P\hat{Y}_1\beta_1 + PX_1\gamma_1 + Pu_1 + P\hat{V}_1\beta_1$$

or

$$\dot{y}_1 = \dot{\hat{Y}}_1\beta_1 + \dot{X}_1\gamma_1 + \dot{u}_1 + \dot{\hat{V}}_1\beta_1 \qquad (39)$$

Equation (39) can be written as

$$\dot{y}_1 = W_2\delta_1 + w_1$$

where

$$W_2 = (\dot{\hat{Y}}_1\ \dot{X}_1)$$

$$w_1 = \dot{u}_1 + \dot{\hat{V}}_1\beta_1$$

Therefore the estimator of $\delta_1$ using OLS will be

$$\hat{\delta}_1 = (W_2'W_2)^{-1}W_2'\dot{y}_1 = (W_2'W_2)^{-1}W_2'(\dot{Z}_1\delta_1 + \dot{u}_1) = (W_2'W_2)^{-1}W_2'\dot{Z}_1\delta_1 + (W_2'W_2)^{-1}W_2'\dot{u}_1 \qquad (40)$$

For consistency to hold, we need to show that the
probability limit of the first term on the right hand side
of equation (40) is equal to $\delta_1$ and that of the second term
is equal to zero. From equations (31) and (34) we have

$$\text{plim}(W_2'W_2/T) = A^{-1} \qquad (41)$$

$$\text{plim}(W_2'\dot{Z}_1/T) = A^{-1} \qquad (42)$$

To prove the orthogonality of $W_2$ and $\dot{u}_1$ note that

$$\text{plim}(W_2'\dot{u}_1/T) = \text{plim}\begin{pmatrix}\hat{Y}_1'P'Pu_1/T\\ X_1'P'Pu_1/T\end{pmatrix} = 0 \qquad (43)$$

since $\hat{Y}_1$ and $X_1$ are orthogonal to $u_1$.

Therefore, taking the probability limit of equation
(40) using the results in (41), (42), and (43) will prove
the consistency of the Theil(II) estimator, i.e.

$$\text{plim}(\hat{\delta}_1) = (A^{-1})^{-1}(A^{-1})\delta_1 + (A^{-1})^{-1}\,\text{plim}(W_2'\dot{u}_1/T) = \delta_1$$

To evaluate the asymptotic efficiency of the Theil(II)
estimator, we shall show, following Buse(1983), that its
asymptotic distribution is the same as that of the Theil(IV)
estimator and therefore is equal to that of Theil's
G2SLS estimator. To demonstrate this, write the Theil(IV)
estimator as

$$(W_2'\dot{Z}_1)\,\hat{\delta}_{IV} = W_2'\dot{y}_1$$

Substituting for $\dot{Z}_1$ by

$$\dot{Z}_1 = (\dot{Y}_1,\ \dot{X}_1) = (\dot{\hat{Y}}_1 + \dot{\hat{V}}_1,\ \dot{X}_1) = W_2 + (\dot{\hat{V}}_1,\ 0) = W_2 + \Delta$$

the Theil(IV) estimator will become

$$(W_2'W_2 + W_2'\Delta)\,\hat{\delta}_{IV} = W_2'\dot{y}_1 \qquad (44)$$

We can write the Theil(II) estimator as

$$(W_2'W_2)\,\hat{\delta}_1 = W_2'\dot{y}_1 \qquad (45)$$

Subtracting (45) from (44) and re-arranging the terms, we
will have

$$\sqrt{T}(\hat{\delta}_{IV} - \hat{\delta}_1) = -(T^{-1}W_2'W_2)^{-1}(T^{-1}W_2'\Delta)\,\sqrt{T}\,\hat{\delta}_{IV} \qquad (46)$$

The probability limit of $T^{-1}W_2'\Delta$ on the right hand side of (46) is

$$\text{plim}(T^{-1}W_2'\Delta) = \text{plim}\ T^{-1}\begin{pmatrix}\hat{Y}_1'P'P\hat{V}_1 & 0\\ X_1'P'P\hat{V}_1 & 0\end{pmatrix}$$

Given the assumptions made above we have

$$\text{plim}(\hat{Y}_1'P'P\hat{V}_1/T) = \text{plim}(X_1'P'P\hat{V}_1/T) = 0$$

Since $\sqrt{T}\,\hat{\delta}_{IV}$ has a well defined distribution the right hand
side of equation (46) converges asymptotically to zero. In
other words, the asymptotic distribution of the Theil(II)
estimator is equal to that of the Theil(IV) estimator and hence equal to
that of the Theil G2SLS estimator. However, the small sample
characteristics of the Theil(II) and Theil(IV) estimators will
be different from those of Theil's G2SLS estimator due to
the presence of the additional terms in (31) and (42) that
only asymptotically converge to zero.

To summarize, we have shown that the Theil type
estimators that use a consistent estimate of the ordinary
reduced form coefficients are asymptotically equivalent.
That is to say the order in which the transformation to the
standard form and transformation for orthogonality are
handled does not affect the asymptotic properties.
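
The practical meaning of this equivalence can be illustrated with a small simulation. The sketch below is not the experimental design of this thesis: it assumes a hypothetical two-variable model with one instrument, a known autocorrelation coefficient, and illustrative parameter values, and all function names are made up for the example. With a single instrument G2SLS and Theil(IV) coincide exactly (the equation is just identified), while Theil(II) differs in finite samples but approaches the same value as T grows.

```python
import math
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def prais_winsten(s, r):
    """Transform a series by P: rescale the first row, quasi-difference the rest."""
    return [math.sqrt(1 - r * r) * s[0]] + [s[t] - r * s[t - 1] for t in range(1, len(s))]

def simulate(T, beta=1.0, pi=1.0, r=0.5, seed=1):
    """Hypothetical model: y = beta*Y + u, Y = pi*x + 0.5*u + noise, u is AR(1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(T)]
    u = [rng.gauss(0, 1) / math.sqrt(1 - r * r)]   # stationary starting value
    for _ in range(1, T):
        u.append(r * u[-1] + rng.gauss(0, 1))
    Y = [pi * x[t] + 0.5 * u[t] + rng.gauss(0, 1) for t in range(T)]
    y = [beta * Y[t] + u[t] for t in range(T)]
    return x, Y, y

def theil_estimators(x, Y, y, r):
    xd, Yd, yd = (prais_winsten(s, r) for s in (x, Y, y))
    # G2SLS: transform first, then project on the transformed instrument.
    pi_dot = dot(xd, Yd) / dot(xd, xd)
    inst = [pi_dot * v for v in xd]
    g2sls = dot(inst, yd) / dot(inst, Yd)
    # Theil(IV) and Theil(II): project first, then transform the fitted values.
    pi_hat = dot(x, Y) / dot(x, x)
    w2 = prais_winsten([pi_hat * v for v in x], r)
    theil_iv = dot(w2, yd) / dot(w2, Yd)   # instrumental variable form
    theil_ii = dot(w2, yd) / dot(w2, w2)   # OLS on transformed fitted values
    return g2sls, theil_iv, theil_ii

x, Y, y = simulate(5000)
g2sls, theil_iv, theil_ii = theil_estimators(x, Y, y, r=0.5)
```

Rerunning `simulate` with a much smaller T is a simple way to see the finite sample gap between Theil(II) and the other two estimators open up.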

The  Generalized  LIML  Estimator 

To derive the asymptotic properties of GLIML, we must
first differentiate equation (31) of Chapter II with respect
to $\beta_1^* = (1, -\beta_1')'$ and $\gamma_1$ for a given value of $r$ and set
the result equal to zero to obtain estimates of the
coefficients. This process will result in the following
equations

$$(\partial/\partial\beta_1^*)\,\big[(\beta_1^{*\prime}\pi_2\beta_1^*)/(\beta_1^{*\prime}\pi_1\beta_1^*)\big] = 0 \qquad (47)$$

$$\gamma_1 = (\dot{X}_1'\dot{X}_1)^{-1}\dot{X}_1'\dot{Y}_1^*\beta_1^* \qquad (48)$$

where

$$\pi_1 = \dot{Y}_1^{*\prime}\big(I - \dot{X}_1(\dot{X}_1'\dot{X}_1)^{-1}\dot{X}_1'\big)\dot{Y}_1^*$$

$$\pi_2 = \dot{Y}_1^{*\prime}\big(I - \dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\big)\dot{Y}_1^*$$

where $\dot{Y}_1^* = (\dot{y}_1,\ \dot{Y}_1)$.

Equations (47) and (48) can be combined, noting that
$\beta_1^* = (1, -\beta_1')'$, to give


(49)


where $\lambda$ is equal to

$$\lambda = \text{Min}\,\big[(\beta_1^{*\prime}\pi_1\beta_1^*)/(\beta_1^{*\prime}\pi_2\beta_1^*)\big]$$

Equivalently, $\lambda$ is the smallest root of the determinantal
equation

$$|\pi_1 - \lambda\pi_2| = 0$$

It is well known that the LIML estimator is a member of the
K-class estimators with $K = \lambda$. Moreover, it can be shown
that29

$$\text{plim}(\lambda - 1) = 0$$

To demonstrate consistency we take the probability limit of
the equation (49), noting that plim$(\lambda) = 1$, and substituting

$$\dot{y}_1 = \dot{Y}_1\beta_1 + \dot{X}_1\gamma_1 + \dot{u}_1$$

to obtain

$$\text{plim}\begin{pmatrix}\hat{\beta}_1\\ \hat{\gamma}_1\end{pmatrix} = \begin{pmatrix}\beta_1\\ \gamma_1\end{pmatrix} + A\ \text{plim}\ T^{-1}\begin{pmatrix}\dot{Y}_1'\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\\ \dot{X}_1'\end{pmatrix}\dot{u}_1 \qquad (50)$$

where A stands for the asymptotic covariance matrix of
Theil's G2SLS estimator.

GLIML is consistent if the second term on the right
hand side of the above equation is asymptotically equal to
zero. The second term can be written as

$$A\ \text{plim}\ T^{-1}\begin{pmatrix}\dot{Y}_1'\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'\dot{u}_1\\ \dot{X}_1'\dot{u}_1\end{pmatrix} \qquad (51)$$

Since by assumption the X's are uncorrelated with the $\epsilon_t$, the
above term will be equal to zero and thus GLIML is
consistent.

Since plim $\lambda = 1$, it follows from (49) that the GLIML
estimator is identical to G2SLS in the limit. Therefore,
GLIML will have the same asymptotic efficiency as G2SLS.


29 See Goldberger(1964, 1965).


IV.  METHODOLOGY  OF  RESEARCH  AND  CRITERIA  OF  EVALUATION 


Most  of  the  knowledge  about  econometric  estimators 
concerns  their  large  sample  properties.  However,  an 
econometric  practitioner  typically  deals  with  small  samples. 
Therefore,  it  is  of  great  interest  to  inquire  into  the  small 
sample  properties  of  these  estimators.  The  relative  lack  of 
knowledge  about  the  small  sample  performance  of  estimators 
stems  from  the  inherent  difficulties  in  their  analytical 
derivation.  This  fact  is  the  main  reason  for  resorting  to 
numerical  analysis  or  computer  simulation  as  an  alternative 
to  mathematical  or  analytical  derivation.  Naylor(1971,  p.2) 
defines  simulation  as 

a  numerical  technique  for  conducting  experiments 
with  certain  types  of  mathematical  models  which 
describe  the  behavior  of  a  complex  system  on  a 
digital  computer  over  extended  periods  of  time. 

Monte Carlo simulation is a technique of performing sampling
experiments on a model using random or pseudorandom numbers.
It is a capital intensive (Summers, 1965) approach to the
problem of assessing the properties of estimators when
analytical or labor-intensive methods become either very
complex or impossible.

However, the results of a Monte Carlo simulation
cannot be accepted without qualification. As Rubinstein(1981,
p. 10) explains

Simulation  is  indeed  an  invaluable  and  very 
versatile  tool  in  those  problems  where  analytical 
techniques  are  inadequate.  However,  it  is  by  no 
means  ideal.  Simulation  is  an  imprecise  technique. 

It  provides  only  statistical  rather  than  exact 
results,  and  only  compares  alternatives  rather  than 
generating  the  optimal  one.... 
The inherent imprecision of the Monte Carlo method is
partly the result of the sampling error, which can be high
depending on the number of replications. This defect is
mainly due to cost constraints that restrict the number of
replications per experiment. Moreover, traditionally, Monte
Carlo studies have used the direct simulation method, which
often produced results that were indeterminate and sometimes
contradictory.

These problems can be partially resolved by using more
efficient techniques than the conventional direct simulation
method discussed by Dhrymes(1970) or Smith(1973). In fact,
numerous techniques have been proposed30 for increasing the
efficiency of Monte Carlo simulations over and above what
can be obtained by conventional or direct simulation. In the
following we discuss the three main Monte Carlo techniques
used in econometric research.

A.  Direct  Simulation  or  Crude  Monte  Carlo 

Let $e_1, \ldots, e_N$ be a sample of N independent random
numbers rectangularly distributed between zero and one.31
Define $G_i$ as a function of the e's such that

$G_i = f(e_i), \quad i = 1, \ldots, N$  (1)

Therefore, the quantities $G_i$ are also independent random
variables with mean equal to G and variance $\sigma^2(G)$. Then,
an unbiased estimator of G can be obtained as

$\hat{G} = \frac{1}{N} \sum_{i=1}^{N} G_i$  (2)

The variance of $\hat{G}$ will be equal to

$\mathrm{Var}(\hat{G}) = \sigma^2(G)/N$  (3)

In practice, $\sigma^2(G)$ is usually unknown and has to be
estimated. Using the sample variance of $G_i$, the unbiased
estimator of $\sigma^2(G)$ will be equal to

$S^2(G_i) = \sum_{i=1}^{N} (G_i - \hat{G})^2/(N-1)$  (4)

Therefore, the unbiased estimator of the variance of $\hat{G}$ is

$\widehat{\mathrm{Var}}(\hat{G}) = S^2(G_i)/N = \frac{1}{N(N-1)} \sum_{i=1}^{N} (G_i - \hat{G})^2$  (5)

$\hat{G}$ is referred to as a "Crude Monte Carlo" estimate of G.

30 See Naylor (1966, 1971), Hammersley and Handscomb (1964),
Rubinstein (1981).

31 Note that frequently the random normal numbers are derived
from rectangularly distributed random numbers lying in the
interval (0,1) using appropriate transformations. As we
shall discuss later, the random normal number generator used
in this study also uses this method.

In direct simulation, as equation (3) indicates,
cutting down the standard error by a factor of n requires
increasing the number of replications by a factor of $n^2$.
This number can be drastically cut down by using more
efficient simulation techniques.
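The crude Monte Carlo estimator and its standard error can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `crude_monte_carlo`, the seed, and the target $f(e) = \exp(e)$ (whose mean over the unit interval is $e - 1$) are choices made here and are not part of the study.

```python
import math
import random


def crude_monte_carlo(f, n_reps, seed=0):
    """Return (G_hat, standard error) from n_reps independent uniform draws."""
    rng = random.Random(seed)
    draws = [f(rng.random()) for _ in range(n_reps)]
    g_hat = sum(draws) / n_reps                                # equation (2)
    s2 = sum((g - g_hat) ** 2 for g in draws) / (n_reps - 1)   # equation (4)
    return g_hat, math.sqrt(s2 / n_reps)                       # square root of (5)


# Illustrative target (an assumption): G = E[exp(e)] = e - 1 for uniform e.
g_small, se_small = crude_monte_carlo(math.exp, 1_000)
g_large, se_large = crude_monte_carlo(math.exp, 4_000)
print(g_small, se_small)
print(g_large, se_large)
```

With four times as many replications the reported standard error should fall by roughly a factor of two, which is the $n^2$ rule stated above.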

In direct Monte Carlo simulation, no refinement is made in
the choice of the random numbers. Variance reducing
techniques can be viewed as methods which use known
information about the problem (Rubinstein, 1981, p. 121). Two
of these variance reduction techniques are control variates
and antithetic variates.


B. Control Variate Technique

The sampling error in the direct or crude Monte Carlo
simulation of G arises from the variation of $(G_i - G)$ over
replications as $e_i$ runs over $0 < e_i < 1$. The control variate
technique is a method which reduces this variation and
therefore improves the efficiency of simulation.

Mikhail (1972, 1975) applied the control variate
technique to two stage and three stage least squares and
full information maximum likelihood for static models and
achieved considerable efficiency gains in estimating means
and variances. Hendry and Harrison (1974) employed control
variates to estimate the biases of OLS and 2SLS for
various specifications of a dynamic structure with
autocorrelated errors and also obtained significant
efficiency gains.32

The essence of the control variate technique is to find
an auxiliary statistic C such that $\hat{G}$ and C are positively
correlated and the distribution of C is known. Using C we
control as much as possible of the variation of $\hat{G}$
analytically and estimate the remainder. Through this
approach we minimize the variation of the parameters to be
estimated by absorbing most of their variation using
control variates. Hence, the resulting Monte Carlo errors
will be considerably reduced.

32 See also Hendry (1979), Hendry and Srba (1977).

Consider C to be a control variate for $\hat{G}$ with known
expectation E(C) and variance Var(C). Then, we use C to
construct an estimator for G with a smaller variance than
the estimator $\hat{G}$. Therefore, instead of investigating the
direct simulation estimator $\hat{G}$, one can use

$\hat{G}^{*} = \hat{G} - \theta\,(C - E(C))$  (6)

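where $\theta$ is a parameter that will be determined shortly. From
(6) we have

$E(\hat{G}^{*}) = E(\hat{G})$

so that $\hat{G}^{*}$ is an unbiased estimator of G. Moreover, the
variance of $\hat{G}^{*}$ will be

$\mathrm{Var}(\hat{G}^{*}) = \mathrm{Var}(\hat{G}) + \theta^2\,\mathrm{Var}(C) - 2\theta\,\mathrm{Cov}(\hat{G}, C)$  (7)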

which will be smaller than $\mathrm{Var}(\hat{G})$ if

$\theta^2\,\mathrm{Var}(C) < 2\theta\,\mathrm{Cov}(\hat{G}, C)$  (8)

Therefore, the value of $\theta$ can be chosen so as to
minimize $\mathrm{Var}(\hat{G}^{*})$. This value will be equal to

$\theta = \mathrm{Cov}(\hat{G}, C)/\mathrm{Var}(C)$  (9)

Substituting $\theta$ in (7) yields

$\mathrm{Var}(\hat{G}^{*}) = (1 - R_k^2)\,\mathrm{Var}(\hat{G})$  (10)

where $R_k$ is the correlation coefficient between $\hat{G}$ and C.
Thus the higher the $R_k$, the smaller the $\mathrm{Var}(\hat{G}^{*})$ and
therefore the greater the reduction in variance.
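Equations (6), (9) and (10) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the target $f(e) = \exp(e)$, the control variate $C = e$ with known mean 1/2, and the use of sample moments in place of the unknown covariance and variance in (9), which is a common practical shortcut rather than the procedure of any study cited above.

```python
import math
import random


def control_variate_estimate(f, n_reps, seed=0):
    """Estimate E[f(e)] using the uniform draw e itself as control variate."""
    rng = random.Random(seed)
    e = [rng.random() for _ in range(n_reps)]
    g = [f(x) for x in e]
    g_bar = sum(g) / n_reps                       # crude estimate, equation (2)
    c_bar = sum(e) / n_reps                       # sample mean of the control
    cov_gc = sum((gi - g_bar) * (ci - c_bar)
                 for gi, ci in zip(g, e)) / (n_reps - 1)
    var_c = sum((ci - c_bar) ** 2 for ci in e) / (n_reps - 1)
    var_g = sum((gi - g_bar) ** 2 for gi in g) / (n_reps - 1)
    theta = cov_gc / var_c                        # sample analogue of (9)
    g_star = g_bar - theta * (c_bar - 0.5)        # equation (6), E(C) = 1/2
    r_squared = cov_gc ** 2 / (var_c * var_g)     # R_k^2 appearing in (10)
    return g_bar, g_star, theta, r_squared


g_bar, g_star, theta, r_squared = control_variate_estimate(math.exp, 2_000)
print(g_bar, g_star, theta, r_squared)
```

Because $\exp(e)$ is nearly linear in $e$ on the unit interval, the squared correlation is close to one, so by (10) almost all of the sampling variance is removed in this toy case.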

Equation (10) provides an important intuitive
explanation of the control variate technique. It
demonstrates that if nothing is known about the estimator $\hat{G}$,
then no part of that estimator can be solved analytically,
i.e., $R_k = 0$, and therefore the variance of $\hat{G}^{*}$ will be equal
to $\mathrm{Var}(\hat{G})$. In other words, there will be no reduction in
variance and the control variate technique is not
applicable. As information about $\hat{G}$ increases, a greater part
of its variation can be solved analytically through C and
thus $R_k$ increases, which in turn decreases the variance of
$\hat{G}^{*}$. If everything is known about $\hat{G}$, its whole variation can
be explained analytically and $R_k$ becomes equal to one. This
means that in the case of perfect knowledge, the variance of
the estimator $\hat{G}^{*}$ is equal to zero.

The main problem in applying this method is to
find a C that is highly correlated with $\hat{G}$ and whose
distribution can be analytically derived. In practice, it is
usually the lack of knowledge about the distribution of $\hat{G}$
which necessitates the experiment. Therefore, it is highly
unlikely that one can find an estimator C whose distribution
is close to that of $\hat{G}$.

C.  Antithetic  Variates  Technique 

The antithetic variates technique is another method for
reducing the sampling variation of a Monte Carlo experiment.
It is due to Hammersley and Morton (1956).33 The basic idea
of the antithetic variates technique is to find another
statistic $\hat{G}^{**}$ having the same expectation as $\hat{G}$ and which has
a strong negative correlation with $\hat{G}$. Then the combination
of these two will result in an estimate with the same
expectation as $\hat{G}$ but smaller variance. It can be viewed as a
stratification of the range of the random errors into two
opposite (negatively related) sections and then sampling
equally from each half.34 Therefore, the two sets of
estimators corresponding to the two negatively related sections
would mutually compensate each other's variations without
affecting the unbiasedness of the result.

33 See also Hammersley and Handscomb (1964),
Rubinstein (1981), Hammersley and Mauldon (1956).

This method was employed by Mikhail (1972, 1975). Hendry
and Trivedi (1972) also used this method in a study of
maximum likelihood estimation of difference equations with
moving average errors. The results were a considerable gain
in efficiency in estimating the variance and reducing the
small sample bias. Hendry and Trivedi reported gains in
efficiency of about 300 to 500 percent in estimation of the
biases in the structural parameters.

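The antithetic method proceeds as follows:

Let $\hat{G}^{**}$ be the antithetic variate for $\hat{G}$, then define

$\hat{G}^{*} = (\hat{G}^{**} + \hat{G})/2$

which will be an unbiased estimator of G, and its sampling
variance will be equal to

$\mathrm{Var}(\hat{G}^{*}) = \tfrac{1}{4}\mathrm{Var}(\hat{G}) + \tfrac{1}{4}\mathrm{Var}(\hat{G}^{**}) + \tfrac{1}{2}\mathrm{Cov}(\hat{G}^{**}, \hat{G})$  (11)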

If $\mathrm{Cov}(\hat{G}^{**}, \hat{G})$ can be made strongly negative, then the
sampling variance of $\hat{G}^{*}$ will be drastically reduced below
the random sampling outcome, $\mathrm{Var}(\hat{G})$. For the choice of
antithetic variate, recall that $G_i = f(e_i)$, where the $e_i$'s
were independently rectangularly distributed random numbers
between zero and one. One can use $e_i^{*} = 1 - e_i$ to get an
antithetic estimator $\hat{G}^{**}$ which is likely to be highly
negatively correlated with $\hat{G}$. The random numbers $e_i^{*}$ will
also be rectangularly distributed in the range zero to one.
Furthermore, random normal variables corresponding to $e_i^{*}$
are frequently derived using the transformation35

$U_i^{*} = \sum_{j=1}^{12} e_j^{*} - 6$  (12)

where the sum runs over the twelve uniform numbers used to
form one normal number. These random normal variables will
have the same mean and variance as the random normal
variables corresponding to $e_i$. Moreover, using the above
transformation, the random normal variables corresponding to
$e_i$ will be equal in magnitude to the ones corresponding to
$e_i^{*}$, with opposite sign.

34 See Hendry and Harrison (1974, p. 156).
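The sign reversal noted above can be verified in one line of algebra. The step below is a sketch which assumes, as footnote 35 indicates, that each normal deviate is formed from twelve uniform numbers; the symbol $U_i$ for the deviate built from the original $e_j$ is notation introduced here for illustration.

```latex
\begin{aligned}
U_i^{*} &= \sum_{j=1}^{12} e_j^{*} - 6
         = \sum_{j=1}^{12} (1 - e_j) - 6 \\
        &= 12 - \sum_{j=1}^{12} e_j - 6
         = -\Bigl(\sum_{j=1}^{12} e_j - 6\Bigr)
         = -U_i
\end{aligned}
```

Since each uniform number has mean 1/2 and variance 1/12, the sum of twelve of them has mean 6 and variance 1, which is why subtracting 6 yields an approximately standard normal deviate.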

The relative efficiency of two Monte Carlo methods can
be defined as follows. Assume two Monte Carlo methods
require $T_1$ and $T_2$ units of computer time or number of
replications, respectively. Also assume that the resulting
estimates of the parameter under consideration have sampling
variances $\sigma_1^2$ and $\sigma_2^2$. Method one is said to be more
efficient if

$E = (T_1 \sigma_1^2)/(T_2 \sigma_2^2) < 1$

Clearly, using the antithetic variate method requires
doubling the number of replications relative to that
required by the crude Monte Carlo method. Therefore, the
antithetic estimator $\hat{G}^{*}$ will be more efficient than
direct simulation only if

$\mathrm{Var}(\hat{G}^{*}) < \tfrac{1}{2}\mathrm{Var}(\hat{G})$  (13)

Rubinstein (1981, pp. 135-6) proved that under the condition
that $G(e_i)$ is a continuous monotonically non-increasing
(non-decreasing) function with continuous first derivatives,
using $e_i$ and $e_i^{*}$ as antithetic variates will always satisfy
the inequality (13).

35 We used the REGEN program developed by Haitovsky and
Jacobs (1972) to generate random normal variables. To
generate the uniform random numbers REGEN uses the RANNO
(Harvard Computing Center) program which utilizes the Power
Residue Method to generate random numbers between zero and
one. To produce the normal random numbers REGEN employs
subroutine GAUS, which applies the Central Limit Theorem to
12 uniform random numbers obtained from RANNO to generate
one normal number.
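Condition (13) and the relative efficiency measure E can be checked numerically for a simple monotone function. The sketch below is illustrative only: the function name `antithetic_vs_crude`, the seed, and the target $f(e) = \exp(e)$ are assumptions made here, and the antithetic method is treated as "method one" with twice the number of function evaluations.

```python
import math
import random


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def antithetic_vs_crude(f, n_pairs, seed=0):
    """Compare crude and antithetic estimates of E[f(e)], e uniform on (0, 1)."""
    rng = random.Random(seed)
    e = [rng.random() for _ in range(n_pairs)]
    thetic = [f(x) for x in e]
    antithetic = [f(1.0 - x) for x in e]          # uses e* = 1 - e
    pair_means = [(a + b) / 2 for a, b in zip(thetic, antithetic)]
    var_crude = sample_variance(thetic) / n_pairs     # estimated Var(G_hat)
    var_anti = sample_variance(pair_means) / n_pairs  # estimated Var(G_hat*)
    # Relative efficiency with T1 = 2 * n_pairs (antithetic), T2 = n_pairs (crude).
    efficiency = (2 * n_pairs * var_anti) / (n_pairs * var_crude)
    return sample_mean(pair_means), var_crude, var_anti, efficiency


estimate, var_crude, var_anti, efficiency = antithetic_vs_crude(math.exp, 1_000)
print(estimate, var_crude, var_anti, efficiency)
```

An efficiency value below one is equivalent to inequality (13): the antithetic estimate then beats a crude estimate that is allowed the same total number of function evaluations.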

The merit of the antithetic variate technique lies in
its simplicity. As Hammersley and Handscomb (1964, p. 61)
argued,

From  the  practical  viewpoint,  the  mathematical 
conditions  that  a  Monte  Carlo  technique  has  to 
satisfy  govern  its  efficiency.  As  in  the  case  of 
importance  sampling  and  control  variates,  we  are 
usually  unable  to  satisfy  the  conditions  in  the 
theoretically  optimum  way,  and  we  have  to  be  content 
with  some  compromise.  When  the  conditions  are  fairly 
loose  and  flexible,  it  is  easier  to  reach  a  good 
compromise.  This  is  the  case  with  antithetic 
variates;  in  practice  it  is  relatively  easy  to  find 
a  negatively  correlated  unbiased  estimator  of  G, 
usually  easier  than  it  is  to  find  an  equally 
satisfactory  control  variate  or  importance  function. 
Accordingly,  the  antithetic  variate  method  tends  to 
be  more  efficient  in  practice. 

In the present work we shall employ both the direct Monte
Carlo technique and the antithetic variate method in
simulating the small sample properties of econometric
estimators.


D.  Criteria  of  Evaluation 

To compare the finite sample properties of estimators,
there is a need for some measures by which the performance
of each estimator can be evaluated. These measures, or
criteria of goodness, represent the summary characteristics
of the sampling distribution of the estimator.

If $\hat{G}_{ij}$ represents the estimate of the jth parameter $G_j$ in
the ith replication, obtained by either direct simulation or
the antithetic variate method, then the evaluation
criteria used in this study are the following:

1. The mean of the sample estimate

   $\bar{G}_j = \sum_{i=1}^{N} \hat{G}_{ij}/N$  (14)

2. The bias of $\hat{G}_{ij}$ is equal to $\hat{G}_{ij} - G_j$, therefore the mean
   bias of the sample estimate is

   $\mathrm{BIAS}(j) = \sum_{i=1}^{N} (\hat{G}_{ij} - G_j)/N = \bar{G}_j - G_j$  (15)

3. The Root Mean Square Error (RMSE), which takes into
   account bias and dispersion of the estimates at the same
   time and represents the discrepancy between the sample
   estimates and the corresponding true parameter values; it
   is defined as

   $\mathrm{RMSE}(j) = [\sum_{i=1}^{N} (\hat{G}_{ij} - G_j)^2/N]^{1/2}$  (16)

4. Defining the Mean Squared Error (MSE) matrix as

   $\mathrm{MSE} = [\sum_{i=1}^{N} (\hat{G}_i - G)(\hat{G}_i - G)']/N$  (17)

   where N is the number of replications and $\hat{G}_i$ is a $K \times 1$
   vector whose jth element is $\hat{G}_{ij}$, Trace(MSE) can be
   defined as

   $\mathrm{Trace}(\mathrm{MSE}) = \sum_{j=1}^{K} \mathrm{MSE}_{jj}$  (18)

   where K is the number of parameters to be estimated.
   The Trace(MSE) index provides an aggregate measure
   used to compare the relative efficiency of different
   estimators.

5. A good summary statistic of the aggregate bias of the
   estimated coefficients is the Euclidean distance. If
   there are K parameters in the equation under
   consideration, the Euclidean distance is equal to

   $d = [\sum_{j=1}^{K} (\bar{G}_j - G_j)^2]^{1/2}$  (19)

6. Finally, a good summary index of the aggregate
   dispersion is the determinant of the Mean Squared
   Error (MSE) matrix suggested by Dhrymes (1971). Using
   det(MSE), the relative efficiency of two methods can be
   measured by

   $e = \det(\mathrm{MSE}_1)/\det(\mathrm{MSE}_2)$  (20)

   If e < 1, we can conclude that method one is more
   efficient than method two.
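The six criteria can be computed from a table of replicated estimates as in the following sketch. The function names, the dictionary keys and the tiny two-replication data set are hypothetical and serve only to show the arithmetic; the MSE matrix is formed as an average of outer products so that it is K by K.

```python
import math


def determinant(m):
    """Laplace expansion along the first row; adequate for small K."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * determinant([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def evaluation_criteria(estimates, true_values):
    """estimates: N replications, each a list of K estimated coefficients."""
    n, k = len(estimates), len(true_values)
    means = [sum(row[j] for row in estimates) / n for j in range(k)]   # mean, (14)
    bias = [means[j] - true_values[j] for j in range(k)]               # bias, (15)
    mse = [[sum((row[a] - true_values[a]) * (row[b] - true_values[b])
                for row in estimates) / n
            for b in range(k)] for a in range(k)]                      # MSE matrix, (17)
    return {
        "mean": means,
        "bias": bias,
        "rmse": [math.sqrt(mse[j][j]) for j in range(k)],              # RMSE, (16)
        "trace_mse": sum(mse[j][j] for j in range(k)),                 # trace, (18)
        "distance": math.sqrt(sum(b * b for b in bias)),               # Euclidean distance
        "det_mse": determinant(mse),                                   # criterion 6
    }


# Hypothetical example: two replications of a two-parameter estimator.
criteria = evaluation_criteria([[1.0, 2.0], [3.0, 2.0]], [2.0, 1.0])
print(criteria)
```

The ratio of `det_mse` values from two estimators gives the relative efficiency index of criterion 6.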

In using the above statistics we implicitly assumed the
existence of the first two moments of the estimators'
distributions. Since the estimators are ratios of sums of
random variables, it is possible that their moments are
not well behaved. For instance, the existence of a number of
extremely aberrant parameter estimates can cause the
variance to tend to infinity. In these circumstances a
comparison of estimators based on moments which do not
exist is wrong and potentially misleading.

This problem was first discussed by Basmann (1960, 1961)
and Nagar (1959, 1960). Nagar worked in a very general
context and was mainly concerned with the approximate
distributions of K-class and double K-class estimators.
Basmann's work is more specific and primarily deals with the
exact sampling distributions of several two stage least
squares estimators. He demonstrates many instances where the
distributions of the estimators do not possess a finite mean
or variance. However, as we shall discuss later, the first
two moments of the generalized classical linear estimators
exist for the models considered in this study.

Apart from the comparison of the efficiency of the
estimators on the basis of the above mentioned statistics,
an econometric practitioner is likely to be interested in
the reliability of the estimators for hypothesis testing.
Park and Mitchell (1980) studied the performance of
different estimators in hypothesis testing when the
classical linear model had autocorrelated disturbances and
the explanatory variables were trended. They found that all
the estimators they considered seriously underestimated
standard errors and made estimated coefficients appear to be
much more significant than they actually were. This study
will attempt to assess whether their results carry over to
simultaneous equation systems.


V.  EXPERIMENTAL  DESIGN  AND  RESEARCH  STRATEGY 


In order to investigate the small sample properties of
the limited information estimators, the following
three-equation model, which is the same as the one used by
Cragg (1967a, 1967b, 1968), was employed36

$Y B + X \Gamma = U$
$U = U_{-1} R + E$

where Y is a $T \times 3$ observation matrix on the endogenous
variables, X is a $T \times 7$ observation matrix of the exogenous
variables whose first element at every observation is unity,
U is the $T \times 3$ matrix of the disturbances, R is a $3 \times 3$ diagonal
matrix of the autocorrelation coefficients and E is the $T \times 3$
matrix of random errors. B and $\Gamma$ are $3 \times 3$ and $7 \times 3$ matrices of
structural coefficients to be estimated.

The B and $\Gamma$ matrices that are used in all the
experiments in this part are37

$\begin{pmatrix} 1.00 & -0.64 & -0.22 \\ -0.72 & 1.00 & 0.00 \\ 0.00 & -0.17 & 1.00 \end{pmatrix}$

$\Gamma' = -\begin{pmatrix} 32 & 0.65 & 0.00 & 0.00 & 0.52 & 0.00 & 0.00 \\ 18 & 0.00 & 0.68 & 0.00 & 0.67 & 0.00 & 0.34 \\ 37 & 0.00 & 0.95 & 0.39 & 0.00 & 0.18 & 0.00 \end{pmatrix}$

The number of a priori restrictions in terms of excluded
exogenous variables in each equation is greater than the
number of endogenous variables included in the same
equation. This implies that the Basmann (1960, 1961)
conditions for the existence of the first two finite sample
moments are satisfied.38

36 Most of the Monte Carlo studies have used a two equation
model. In his study Mosbaek (1970) argued that "the three
equation size was selected because pilot Monte Carlo runs
showed that they revealed the essential properties of larger
models whereas two equation models do not".

37 Cragg (1967a, 1967b) used five different sets of structural
parameters. His second set is employed in this study. The
reason for the choice of structure two was that it was the
only structure that did not result in occurrence of any
singular or apparently singular matrices or "nonsensical"
results. See Cragg (1967a), Table 3, p. 82.

As was mentioned in Chapter I, the statistical
characteristics of different estimators are likely to be
sensitive to the specification of the process that generates
the exogenous variables. Specifically, the recent Monte
Carlo studies by Maeshiro (1976, 1979), Park and
Mitchell (1981) and Beach and Mackinnon (1978) have emphasized
the effect of trended data on the estimators which omit the
first observation. In general, Monte Carlo studies have used
three types of specifications for the exogenous variables,
namely

1. A stationary autoregressive process of the form39

   $X_t = \lambda X_{t-1} + v_t$

   where $v_t \sim N(0, \sigma_v^2)$

2. A non-stochastic autoregressive process such as40

   $X_t = \lambda X_{t-1} = \lambda^t X_0$

3. A stochastic trended process such as41

   $X_t = \exp(\lambda t) + w_t$

   where $w_t \sim N(0, \sigma_w^2)$

38 Note that the Basmann results were derived for models
without autocorrelation. The presence of autocorrelation,
when the model does not contain any lagged endogenous
variables, does not create any additional problem for
identification. In fact, using additional information about
the structure of the disturbances would aid identification.
Therefore, since the Basmann results depend on the degree of
overidentification of the structural equation, we can expect
them to hold true for the present case. For discussion of
identification and autocorrelation, see F.M. Fisher (1966),
M. Deistler (1975, 1976), M. Deistler and J. Schrader (1979),
C. Hsiao (1981).

39 This specification was used by Rao and Griliches (1969) and
Spitzer (1979).
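The three specifications listed above can be generated as in the sketch below. The function names, the start-up value `x0`, and all parameter values in the example calls are hypothetical; the normal deviates come from Python's `random.gauss` rather than from the generator used in the study.

```python
import math
import random


def stationary_ar(lam, sigma_v, t_obs, rng, x0=0.0):
    """Type 1: X_t = lam * X_{t-1} + v_t with v_t ~ N(0, sigma_v^2)."""
    x, series = x0, []
    for _ in range(t_obs):
        x = lam * x + rng.gauss(0.0, sigma_v)
        series.append(x)
    return series


def nonstochastic_trend(lam, x0, t_obs):
    """Type 2: X_t = lam * X_{t-1} = lam**t * X_0 (no random component)."""
    return [x0 * lam ** t for t in range(1, t_obs + 1)]


def stochastic_trend(lam, sigma_w, t_obs, rng):
    """Type 3: X_t = exp(lam * t) + w_t with w_t ~ N(0, sigma_w^2)."""
    return [math.exp(lam * t) + rng.gauss(0.0, sigma_w)
            for t in range(1, t_obs + 1)]


rng = random.Random(0)
x_stationary = stationary_ar(0.6, 1.0, 30, rng)      # illustrative values only
x_deterministic = nonstochastic_trend(1.05, 10.0, 30)
x_trended = stochastic_trend(0.04, 0.05, 30, rng)
print(x_stationary[:3], x_deterministic[:3], x_trended[:3])
```

A value of `lam` above one in the second process, or above zero in the third, produces the upward trend whose effect on first-observation-dropping estimators is at issue.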

We have employed all three types of specification.42 We have
specifically examined Maeshiro's and Park and Mitchell's
conjecture that the results of the Monte Carlo studies which
have not used trended data (i.e., that of Rao and Griliches)
are questionable. For this purpose we have examined a number
of variations of trended stochastic and non-stochastic
specifications (case 2 and case 3). In particular, we have
examined cases in which trends exist in most of the exogenous
variables of the simultaneous equation system and cases in
which the trended variables appear only in the structural
equation under consideration. The primary reason for this
choice is that some estimators, such as Fair's, employ only the
exogenous variables that are included in the structural
equation under consideration as instruments. Other
estimators, such as Theil's, use all the explanatory variables
as instruments. Therefore, the existence of trend in the
exogenous variables that are excluded from the structural
equation under consideration is likely to affect the latter
type of estimators more seriously.

40 This specification was used by Maeshiro (1976, 1979).

41 This specification was used by Beach and Mackinnon (1978).

42 All the computations were carried out on the University
of Alberta computer facilities. The random numbers were
generated using the REGEN computer program developed by
Haitovsky and Jacobs (1972).

To minimize the probability of excluding any
potentially good estimator, a pilot experiment covering
all the possible variations of the different estimators was
undertaken. The results of the pilot run were carefully
analysed, and the estimators or variations of them which
performed poorly relative to other estimators were excluded
from the subsequent experiments. This strategy had the
following advantages: (1) it covered all potential
estimators; (2) it minimized the total cost of the study;
(3) it minimized the amount of statistics to be analysed and
thus made the results more comprehensible.

The  pilot  experiment  undertaken  in  this  part 
investigates  the  performance  of  13  estimators  when  the 
sample  size  is  equal  to  30,  the  autocorrelation  coefficient 
is  equal  to  0.6  and  the  exogenous  variables  are  specified  to 
be  trended  and  non-trended. 

The disturbance terms, $E_t$, are jointly normally
distributed with a zero mean and variance-covariance matrix
of the form

$\Omega = E(E_t E_t') = \begin{pmatrix} 22.00 & 7.13 & 15.24 \\ 7.13 & 20.00 & 16.77 \\ 15.24 & 16.77 & 25.00 \end{pmatrix}$
Each complete experiment consists of 200 replications,
100 "thetic" and 100 antithetic. Given the covariance
structure $\Omega$, a set of $E_{it}$ (i = 1, 2, 3; t = 1, ..., T) was
generated. Given the structure of the X matrix and the
values of the structural parameters (B's, $\Gamma$'s, r's), the
endogenous variables were calculated and the structural
parameters were estimated. These estimated coefficients were
stored as "thetic" estimates. The $E_{it}$'s were then replaced by
$(-E_{it})$ and new $U_t$ and $Y_t$ were generated. The structural
parameters were estimated again by the same estimators and
stored as "antithetic" estimates. The statistics based on
the direct simulation and the antithetic method are then
reported. The summary table of the 13 estimators employed in
the pilot study for the estimation of the structural
coefficients of the first equation of the above model and
their computer programs are presented in Appendix II.43
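The generation of one thetic and one antithetic set of disturbances can be sketched as follows. This is an illustration under several assumptions that are not taken from the study: the zero start-up value for the autoregressive recursion, the seed, and the helper names are all hypothetical, and the endogenous variables and the estimation step are omitted. The normal deviates are formed from twelve uniform numbers, in the spirit of the generator described in footnote 35.

```python
import math
import random

# Variance-covariance matrix of the random errors E_t, as given in the text.
OMEGA = [[22.00, 7.13, 15.24],
         [7.13, 20.00, 16.77],
         [15.24, 16.77, 25.00]]


def cholesky(a):
    """Lower-triangular factor L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def normal_from_uniforms(rng):
    """Approximate N(0, 1) deviate: sum of twelve uniforms minus six."""
    return sum(rng.random() for _ in range(12)) - 6.0


def draw_errors(t_obs, rng):
    """T rows of correlated errors E_t = L z_t with covariance OMEGA."""
    low = cholesky(OMEGA)
    rows = []
    for _ in range(t_obs):
        z = [normal_from_uniforms(rng) for _ in range(3)]
        rows.append([sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(3)])
    return rows


def ar1_disturbances(errors, r):
    """U_t = R U_{t-1} + E_t with diagonal R; U_0 = 0 is an assumed start-up."""
    u_prev, out = [0.0, 0.0, 0.0], []
    for e_t in errors:
        u_prev = [r[i] * u_prev[i] + e_t[i] for i in range(3)]
        out.append(u_prev)
    return out


rng = random.Random(1)
errors = draw_errors(30, rng)                          # thetic errors, T = 30
r_diag = [0.6, 0.6, 0.6]                               # pilot value of r
u_thetic = ar1_disturbances(errors, r_diag)
u_antithetic = ar1_disturbances([[-e for e in row] for row in errors], r_diag)
print(u_thetic[0], u_antithetic[0])
```

Because the recursion is linear, reversing the sign of every error reverses the sign of every disturbance, so the antithetic disturbances mirror the thetic ones while the exogenous part of the model is left unchanged.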

Table 5-1 categorizes the estimators on the basis of
the reduced form they use and the number of observations
they utilize. Estimators on the same row of Table 5-1 are
basically the same except for the reduced form they employ
in their first stage estimation and the number of
observations (T or T-1) they utilize, and thus are directly
comparable.


43 All the computer programming was done in the APL language
using University of Alberta computer facilities.


[Table 5-1. Column headings: Augmented Reduced Form; Ordinary Reduced Form.]

Note that T is the number of observations used by each method.
A. Monte Carlo Results of the Pilot Experiment

The following chart shows the structure of the pilot
study.

Autocorrelation of the      No. of    Data Specification
Structural Disturbances     Obs.

r = 0.6                     N = 30    Stochastic Non-trended
r = 0.9                     N = 30    Stochastic Trended
r = 0.6                     N = 30    Non-Stochastic Trended

Relative  Efficiency  of  Different  Monte  Carlo  Methods 

Turning now to the results, Table 5-2 presents the mean
bias and its standard error for the Fair and Theil's G2SLS
estimators.44 The relative efficiency of the two methods
(defined as the ratio of the variances of the estimated
coefficients weighted by the number of replications for both
estimators) is shown in the last two columns of Table 5-2.
The antithetic estimates are based on twice as many
replications as the direct simulation estimates. However, it
can be seen that generally the antithetic method has
resulted in gains in efficiency over and above what could be
achieved by employing the direct simulation method with
twice as many replications. In general, we achieved an
average gain in efficiency of about 15 to 28 percent for the
Fair and Theil estimators, respectively. Therefore, in
analysing the results, we focus on the results of the
antithetic estimates (Tables 5-3, 5-4).

44 The exogenous variables were stochastic non-trended as
defined in the next section.

TABLE 5-2: Relative Efficiency of Different Methods
Analysis of the Results

Stochastic Non-trended Exogenous Variables:

In this part the exogenous variables follow a
stationary autoregressive process of the form used by Rao
and Griliches45 with the following specification

                  X1     X2     X3     X4     X5     X6
Mean              10     12     14      8      9     11
Standard Error    5.78   6.93   8.10   4.62   5.20   6.36
$\lambda$         0.6    0.5    0.4    0.3    0.2    0.1

45 The ratios of the mean to the standard error of the
exogenous variables are the same as the ones employed by
Goldfeld and Quandt (1972) in their study of the small sample
properties of the estimators designed for estimation of the
simultaneous equation models characterized by
autocorrelation. This specification of the exogenous
variables will be the same in all the experiments that use
stochastic non-trended exogenous variables.

TABLE 5-3: Estimated Bias Using Stochastic Non-Trended Exogenous Variables
(r = 0.6, T = 30)
*This index does not include the intercept term.

TABLE 5-4: Estimated Root Mean Square Error Using Stochastic Non-Trended Exogenous Variables
(r = 0.6, T = 30)

Evaluating estimators on the basis of their bias and
RMSE, one can summarize the results given in Table 5-3 in
the following points:

1. As was expected, all estimators under-estimated the
   autocorrelation coefficient. Those estimators which use
   the Prais-Winsten method for estimating the
   autocorrelation coefficient had the lowest bias.

2. Using the Beach and Mackinnon cubic formula instead of
   the Prais-Winsten method for estimating the
   autocorrelation coefficient in the Theil estimator did
   not have any significant effect on the estimated
   coefficients. This result confirms the findings in the
   single equation context that omission of the Jacobian of
   the transformation from the likelihood function would
   not materially affect the estimated coefficients. Since
   both methods on average resulted in approximately the
   same number of iterations (i.e., 4.3), it would seem
   reasonable to choose the Prais-Winsten formula over the
   cubic one due to its relative simplicity and lower cost.

3.  Comparison  of  different  estimators  on  the  basis  of  their 
index  of  overall  bias  measured  by  the  Euclidean 
distance46 indicates that the Brundy and Jorgenson estimator
had the lowest and the Fair estimator had the highest index
of  overall  bias.  In  general,  the  index  of  overall  bias 
showed  that  none  of  the  estimators  performed  very  poorly 
relative  to  others  in  estimation  of  the  structural 
coefficients  except  the  intercept.  The  Fair  estimator 
seriously  under-estimated  the  constant  term. 

4.  The  relative  efficiency  of  employing  different  reduced 
forms  was  investigated  by  comparing  different  versions 
of  the  Brundy  and  Jorgenson  estimator.  Fair-Brundy, 
which  uses  the  augmented  reduced  form,  had  lower  RMSE 
for  all  the  estimated  coefficients  except  the  intercept. 

46Note that we have omitted the constant term from the
calculation of this index. This was mainly due to the fact
that the relatively large biases in the constant term could
seriously distort the overall picture of the performance of
the estimators.
However, it must be noted that in two instances the
Fair-Brundy estimator resulted in extreme outliers that
were not acceptable and were therefore removed from the
list of estimated coefficients. Thus, the Fair-Brundy
results are based on 196 "thetic" and
antithetic estimates. Taking these two outliers into
account renders the Fair-Brundy estimator inferior to
the Brundy estimator.

5. Investigation of the relative efficiency of utilizing
the first observation was carried out by comparing Theil
G2SLS with the modified generalized two stage least
squares estimator (MG2SLS), Brundy(T-1) with Brundy(T) and
GLIML(T-1) with GLIML(T). The Theil and GLIML(T) estimators
showed lower RMSE than MG2SLS and GLIML(T-1) for all but
one coefficient. The Brundy(T) estimator almost invariably
dominated Brundy(T-1). These results suggest that using
the first observation does increase the small sample
efficiency of estimators.

6. The small sample effect of the order in which the two
problems of simultaneity and autocorrelation are handled
in the Theil type estimators was examined by comparing
Theil(G2SLS), Theil(II) and its instrumental variable
version Theil(IV). Theil(G2SLS) had the lowest aggregate
bias and almost invariably dominated the other two
estimators on the RMSE criterion. This result confirms
our findings in Chapter III that the second version of
the Theil estimator is numerically different from Theil
G2SLS due to the existence of the additional terms that
converge asymptotically to zero.

7.  Comparison  of  the  ALIML  and  GLIML  suggests  that  GLIML 
which  is  derived  under  a  different  set  of  assumptions 
performs  better  in  small  samples.  GLIML  showed  much 
lower  RMSE  for  all  the  estimated  coefficients. 

Finally,  an  overall  comparison  of  different  estimators, 
on  the  basis  of  the  sum  of  the  MSE  of  the  estimated 
coefficients  (excluding  the  intercept),  suggests  the 
superiority of Theil type estimators, followed by MG2SLS,
Fair's estimator, the LIML estimators and Brundy and Jorgenson
type estimators.
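The summary measures used in the points above (bias, RMSE, the index of overall bias and the sum of the MSEs, the last two excluding the intercept) can be sketched as follows. The layout of the replications, with the intercept in position 0, and all function names are assumptions made for illustration; the toy numbers are invented and are not results from the experiments.

```python
import math


def bias_and_rmse(estimates, true_values):
    """Per-coefficient bias and RMSE over Monte Carlo replications.

    estimates: list of replications, each a list of coefficient estimates
    with the intercept in position 0 (an assumed layout).
    """
    n = len(estimates)
    k = len(true_values)
    bias = [sum(rep[j] for rep in estimates) / n - true_values[j]
            for j in range(k)]
    mse = [sum((rep[j] - true_values[j]) ** 2 for rep in estimates) / n
           for j in range(k)]
    rmse = [math.sqrt(m) for m in mse]
    return bias, rmse


def overall_bias_index(bias):
    """Euclidean distance of the bias vector, intercept (position 0) excluded."""
    return math.sqrt(sum(b ** 2 for b in bias[1:]))


def sum_of_mse(rmse):
    """Sum of the MSEs of the slope coefficients (intercept excluded)."""
    return sum(r ** 2 for r in rmse[1:])


# Toy replications, invented for illustration only.
true_beta = [1.0, 0.5, -0.2]
reps = [[0.4, 0.6, -0.1], [0.8, 0.4, -0.3], [0.6, 0.5, -0.2], [0.2, 0.7, -0.2]]
bias, rmse = bias_and_rmse(reps, true_beta)
index = overall_bias_index(bias)
total_mse = sum_of_mse(rmse)
```

In the toy data the intercept is badly biased while the slopes are not; dropping position 0 from both indices keeps that single coefficient from dominating the comparison, which is the motivation given in footnote 46.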

Trended  Exogenous  Variables 

In this part we employ two different specifications of
trended data, i.e., trended stochastic and trended
non-stochastic exogenous variables.

1- Trended stochastic exogenous variables:

To test the effect of trended data on the performance
of the limited information estimators we considered two
cases. In the first, four of the exogenous variables, only
one of which (i.e., X4) appears in the structural equation
under consideration, are trended. In the second, the effect of
trend existing only in the variables of the structural equation
under consideration (i.e., X1 and X4) is examined. The exogenous
variables in both cases follow the same specification used by
Beach and Mackinnon.
Case 1: X1 and X2 are the same as in the previous experiment, and

$X_3 = e^{0.10t} + w_1, \quad w_1 \sim N(0, 0.0009)$

$X_4 = e^{0.11t} + w_2, \quad w_2 \sim N(0, 0.0025)$

$X_5 = e^{0.12t} + w_3, \quad w_3 \sim N(0, 0.0049)$

$X_6 = e^{0.13t} + w_4, \quad w_4 \sim N(0, 0.0081)$
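As a rough illustration of the Case 1 design, the trended stochastic variables can be generated as below. This is a sketch under stated assumptions: the time index is taken to run over t = 1, ..., T, the second argument of N(., .) is read as a variance, and the names `trended_series`, `GROWTH` and `NOISE_VAR` and the seed are hypothetical.

```python
import math
import random

# Growth rates and noise variances for X3..X6 in Case 1.
GROWTH = {"X3": 0.10, "X4": 0.11, "X5": 0.12, "X6": 0.13}
NOISE_VAR = {"X3": 0.0009, "X4": 0.0025, "X5": 0.0049, "X6": 0.0081}


def trended_series(rate, noise_var, T, rng):
    """X_t = exp(rate * t) + w_t with w_t ~ N(0, noise_var), t = 1..T (assumed)."""
    sd = math.sqrt(noise_var)
    return [math.exp(rate * t) + rng.gauss(0.0, sd) for t in range(1, T + 1)]


rng = random.Random(0)  # arbitrary seed, for reproducibility only
trended = {name: trended_series(GROWTH[name], NOISE_VAR[name], 30, rng)
           for name in GROWTH}
```

With noise standard deviations between 0.03 and 0.09, the exponential trend dominates each series almost from the first observation.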

The experiments in the single equation context suggest
substantial gains in the efficiency of the estimators which
employ all T observations relative to the ones which omit
the first observation. When the data are trended, however,
Tables 5-5 and 5-6 show that this gain in efficiency does not
materialize in simultaneous equation systems. These tables
show that the relative efficiency gained from the first
observation by the Theil, GLIML(T) and Brundy(T) estimators has
increased slightly relative to the non-trended case.
Nevertheless, the overall picture has not changed
dramatically. The important observation is that Fair's
estimator becomes superior to Theil's estimator. This
picture was more or less the same when we changed the degree
of autocorrelation to 0.9 (Table 5-7).

Case 2: In this case only the X's that appear in the
structural equation under consideration are trended; the
rest are stochastic non-trended as in the case of the first
experiment.

$X_1 = e^{0.15t} + w_1, \quad w_1 \sim N(0, 0.0009)$

$X_4 = e^{0.19t} + w_2, \quad w_2 \sim N(0, 0.0025)$


TABLE 5-5: Estimators Using T-1 Observations and Stochastic Trended Data

(r = 0.6, T = 30)

TABLE 5-6: Estimators Using T Observations and Stochastic Trended Data

(r = 0.6, T = 30)

The results summarized in Table 5-8 show that the
Theil estimator dominates the Fair and MG2SLS estimators on the
basis of the trace(MSE) criterion.

2- Trended non-stochastic exogenous variables:

The third set of experiments was conducted with
non-stochastic trended data. We considered three cases. In the
first, the exogenous variables appearing in the structural
equation under consideration were downward trended. The
same experiment was then repeated assuming that these exogenous
variables were upward trended. Finally, the case in which
four of the exogenous variables were trended was considered.
The results show that only in the cases where both of the
exogenous variables in the equation under consideration were
strongly trended did Theil's estimator perform better than
Fair and MG2SLS, and this difference was especially noticeable
for the coefficients of those two explanatory variables.

Case 1: All X's were stochastic non-trended as in
experiment one except X1 and X4, which were non-stochastic
trended as

$X_{1,t} = 0.6\,X_{1,t-1}$

$X_{4,t} = 0.8\,X_{4,t-1}$

Table 5-9 shows the comparison between the Fair, Theil and
MG2SLS estimators when the autocorrelation coefficient is
equal to 0.6. The efficiency of Theil's estimator in
estimating the coefficients of X1 and X4 is much higher than
that of both the Fair and MG2SLS estimators. Generally, Theil's
estimator performed much better than the Fair estimator.


TABLE  5-8:  Root  Mean  Square  Error  of  Different  Estimators  Using  Stochastic  Trended  Data 

(r = 0.6, T = 30)
TABLE 5-9: Root Mean Square Error of Different Estimators Using Non-Stochastic Downward Trended Data

Case 2: In this case the X's that appear in the first
equation were upward trended, while the rest of the
exogenous variables were left the same as in the first
experiment.

$X_{1,t} = 1.10\,X_{1,t-1}$

$X_{4,t} = 1.13\,X_{4,t-1}$

As Table 5-10 shows, the Theil estimator outperformed the
MG2SLS  and  Fair  estimators. 

Case 3: In this case X5 and X6 are stochastic
non-trended as in the first experiment and

$X_{1,t} = 0.6\,X_{1,t-1}$

$X_{2,t} = 0.7\,X_{2,t-1}$

$X_{3,t} = 0.8\,X_{3,t-1}$

$X_{4,t} = 0.9\,X_{4,t-1}$

Table 5-11 shows that Theil's estimator, which employs
all T observations, performs only slightly better than
its analogue MG2SLS, which omits the first observation. Also,
as was observed in case one, the Theil estimator performed
better than the Fair estimator, possibly due to the presence
of a high degree of trend in the explanatory variables that
appear in the first equation.

To summarize, the above results show that when the
explanatory variables were trended, the Theil estimator,
which uses all T observations, always performed better than
its analogue, MG2SLS, which utilizes T-1 observations. These
results also showed that whenever the exogenous variables
included in the structural equation under consideration were


TABLE  5-10:  Root  Mean  Square  Error  of  Different  Estimators  Using  Non-Stochastic  Upward  Trended  Data 

(r = 0.6, T = 30)
strongly trended while the other X's were stochastic
non-trended, the Fair estimator did poorly relative to the Theil
estimator. The performance of Fair relative to the Theil
estimator improved whenever the degree of trend in the
exogenous variables outside the structural equation under
consideration increased. These results can be explained by
recalling that the Fair estimator uses as instruments only
the X's that appear in the structural equation under
consideration. On the other hand, the Theil estimator uses
as instruments all the X's included in the system.

Therefore, the existence of trend in the exogenous variables
outside the structural equation under consideration does not
directly affect the performance of the Fair estimator. On
the other hand, the Fair estimator is directly affected
whenever the only trended X's are the ones included in the
structural equation under consideration.

Perhaps a more important observation is that the above
results seem to suggest that the effect of the trended data
on the performance of the simultaneous equation estimators
is different from that on the single equation estimators.
Appendix III further investigates this important
observation. In Appendix III, the findings of Taylor(1981)
for the single equation models are reviewed. Then, the
observed differences between the simultaneous equation and
the single equation models are explained. It is shown that
the results of the single equation methods, regarding the
importance of the first observation when the explanatory
variables are trended, are derived on the basis of very
restrictive models. Finally, the investigation in Appendix
III clarifies our findings in the simultaneous equation
system that there does not exist a dramatic difference
between the limited information methods which employ T or
T-1 observations when the exogenous variables are trended.
The findings of Doran and Griffiths(1982) also support our
results. They conducted experiments using the seemingly
unrelated regression model with first order autoregressive
disturbances. Their experiment was similar to that of
Maeshiro(1980). They were particularly interested in
finding the relative efficiency of the estimators which
employ the first observation compared with those that omit it.
They concluded that "in general, we feel that Maeshiro(1980) has
overstated the case for inclusion of the initial transformed
observation...".47 Therefore, following conventional Monte
Carlo studies, in the rest of the experiments discussed in
the next section, we shall only employ stochastic
non-trended exogenous variables.

B.  General  Results  of  the  Monte  Carlo  Experiments 

Table 5-12 shows the structure of the remaining
experiments conducted in this part of the study. The seven
estimators employed in this part are: Fair, Theil G2SLS
(Theil(PW)), MG2SLS, ALIML, GLIML, 2SLS and Theil(True).48

47Doran, H.E., and W.E. Griffiths(1982), p.26.

48Note  that  from  the  family  of  Theil  type  estimators  we  only 
included  the  one  which  performed  best,  even  though  the  other 
two  excluded  versions  performed  better  than  some  of  the 
TABLE 5-12: Structure of the Monte Carlo Experiments

The 2SLS method estimates the structural coefficients with
the restriction that the autocorrelation coefficient is
equal to zero. This is, in fact, a misspecification which
takes account of the simultaneity but ignores the
autocorrelation. The Theil(True) method estimates the
structural coefficients under the assumption of a known
autocorrelation coefficient.
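The distinction between estimators that use all T observations and those that use T-1 rests on how the data are transformed once a value of the autocorrelation coefficient is available. The sketch below shows only that transformation step for a single series, assuming the standard first order autoregressive transformation with a known r; it is not an implementation of the Theil or MG2SLS estimators, and the function name `ar1_transform` and the data are hypothetical.

```python
import math


def ar1_transform(series, r, keep_first=True):
    """Quasi-difference a series for a known AR(1) coefficient r.

    keep_first=True  -> T observations; the first is scaled by sqrt(1 - r^2)
                        (the Prais-Winsten treatment of the first observation).
    keep_first=False -> T-1 observations; the first observation is dropped.
    """
    diffs = [series[t] - r * series[t - 1] for t in range(1, len(series))]
    if keep_first:
        return [math.sqrt(1.0 - r ** 2) * series[0]] + diffs
    return diffs


y = [2.0, 2.6, 3.1, 2.9]                            # invented data
full = ar1_transform(y, r=0.6)                      # T observations
short = ar1_transform(y, r=0.6, keep_first=False)   # T-1 observations
```

The two versions differ only in the first row, which is why any efficiency gain from using all T observations comes entirely from the information carried by that one scaled observation.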

In what follows, we shall compare the estimators
according to their aggregate bias as measured by the
Euclidean distance, normalized det(MSE) and normalized
trace(MSE).49 The detailed statistics used to construct these
indices are given in Appendix IV.
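The two normalized indices can be computed from the replications roughly as follows. This is a sketch under assumptions: the MSE matrix is taken to be the average outer product of the estimation errors for the slope coefficients, the normalized trace is defined by analogy with the normalized determinant, and all function names are hypothetical.

```python
def mse_matrix(estimates, true_values):
    """Average of (b - beta)(b - beta)' over replications."""
    k = len(true_values)
    n = len(estimates)
    m = [[0.0] * k for _ in range(k)]
    for rep in estimates:
        d = [rep[j] - true_values[j] for j in range(k)]
        for i in range(k):
            for j in range(k):
                m[i][j] += d[i] * d[j] / n
    return m


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return result


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def normalized_indices(mse_k, mse_base):
    """Return det(MSE_k)/det(MSE_base) and trace(MSE_k)/trace(MSE_base)."""
    return det(mse_k) / det(mse_base), trace(mse_k) / trace(mse_base)
```

Here `mse_base` would be the MSE matrix of the Theil(True) estimator, so a value above one indicates that an estimator is less efficient than the benchmark that knows the autocorrelation coefficient.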

1.  Comparison  using  bias  as  a  criterion  of  evaluation 

Table 5-13 presents the results of the antithetic
estimates of the aggregate bias of different estimators when
the sample size is 30 and 60. For a low value of the autocorrelation
coefficient (r=0.2) and the sample size of 30, the GLIML
and Theil estimators have the lowest aggregate bias, while
the Fair estimator has the highest. For higher values of the
autocorrelation coefficient and the sample size of 30,
MG2SLS and Theil estimators have the lowest, while the Fair

48(cont'd) estimators which we considered in our final
experiment. This choice was made due to our interest in
covering a variety of methods rather than comparing
different versions of one method. Also, the inclusion of the
LIML estimators makes our work comparable to that of Beach
and Mackinnon(1978) in the single equation context.

49 To remind the reader, normalized det(MSE) refers to
the index suggested by Dhrymes(1971):
$\text{Normalized det(MSE)} = \det(MSE_k) / \det(MSE_t)$
where $MSE_t$ is the MSE of the Theil(True) estimator.
coefficient biases, while that of the other estimators includes the estimated
bias of the autocorrelation coefficient as well.
estimator again has the highest aggregate bias. The
important observation is that the aggregate bias of 2SLS,
which ignores the autocorrelation, increases as the
autocorrelation coefficient increases. Asymptotic theory
suggests that, in the absence of lagged endogenous variables,
2SLS, which takes account of simultaneity but ignores the
autocorrelation problem, yields consistent estimates of the
structural coefficients. In other words, the consistency of
2SLS does not depend on the degree of autocorrelation. Hence
it should not behave differently from other consistent
estimators in response to a change in the degree of
autocorrelation. However, the above results suggest that the
small sample bias of 2SLS is positively related to the
degree of autocorrelation.

As was noted before, the index of the aggregate bias
does not include the constant term. In estimation of the
constant term, for all degrees of autocorrelation, the Fair
estimator had the highest bias. This high degree of
under-estimation of the constant term can have serious
effects on the usage of this estimator for prediction
purposes.

In estimation of the autocorrelation coefficient, the
Theil estimator which uses the Prais-Winsten formula had the
lowest bias for all degrees of autocorrelation.

As the sample size increased from 30 to 60, all estimators
(except Fair) experienced a reduction in their bias. In other
words, increasing the sample size to 60 was enough for most
of the estimators to show their asymptotic unbiasedness.
Fair's estimator again had the highest aggregate bias. The
Theil estimator showed the lowest bias in estimating the
autocorrelation coefficient.

2.  Comparison  using  normalized  det(MSE)  criterion 

A more illuminating picture of the performance of
different estimators can be observed by examining Table 5-14,
in which we have normalized the det(MSE) of different
estimators by dividing it by the corresponding det(MSE) of
the Theil(True) estimator. When the sample size is equal to 30
we can observe that for low degrees of autocorrelation,
i.e., r=0.2, 2SLS has the lowest normalized det(MSE),
followed by the Theil and MG2SLS estimators. But as the
autocorrelation coefficient increases, the performance of
2SLS deteriorates and it becomes inferior to the Theil
estimator when the autocorrelation coefficient becomes very
high, i.e., r=0.9. This behaviour of 2SLS confirms the
asymptotic theory that suggests a serious loss of efficiency
for those estimators which ignore the autocorrelation
problem.

Comparison of the Theil and MG2SLS estimators reveals
that the Theil estimator, which utilizes all T observations,
has lower det(MSE) than MG2SLS, which omits the first
observation, for all degrees of autocorrelation. Therefore,
utilization of the first observation seems to have a positive
effect on the performance of the Theil estimator.50
Comparison of GLIML and ALIML revealed that the GLIML
estimator outperformed ALIML for all degrees of
autocorrelation when the sample size was 30.

Table 5-14 also demonstrates that the above picture
stays more or less the same as the sample size increases
from 30 to 60 observations. For a very low degree of
autocorrelation, 2SLS again performed best, but its
performance deteriorated rapidly as the autocorrelation
coefficient increased. It became extremely inefficient when
the degree of autocorrelation reached 0.9. The performance
of ALIML and GLIML improved considerably (especially for
r=0.6 and r=0.9) as the sample size increased. Finally, the
efficiency gain caused by employment of the first
observation did not disappear even when the sample size was
increased to 60. The Theil estimator, which utilizes all T
observations, is still superior to the MG2SLS estimator for
all degrees of autocorrelation. This fact suggests that
employment of the first observation does increase the
efficiency of estimators even for moderately large sample
sizes.


50 The fact that some of the entries in Tables 5-14 and 5-15
are equal to or less than one suggests equivalence or higher
efficiency of a particular estimator relative to Theil(True)
in estimating some of the coefficients.

3. Comparison using normalized trace(MSE) criterion

A  slightly  different  picture  appears  when  we  compare 
estimators  on  the  basis  of  the  normalized  trace(MSE).  It 
should  be  mentioned  that,  as  was  the  case  with  normalized 
det(MSE),  the  normalized  trace  index  does  not  include  the 
constant  term.  We  omitted  the  intercept  term  since  the 
complete  trace  was  often  dominated  by  the  MSE  of  the 
intercept term. Since researchers are often interested in
obtaining reliable estimates of the slope coefficients, the
complete trace index could provide a misleading assessment.

Comparing estimators on the basis of the normalized
trace, we can see (Table 5-15) that for a sample size of 30,
2SLS is no longer best even for low degrees of
autocorrelation, i.e., r=0.2. The relative performance of
2SLS deteriorated rapidly as the autocorrelation coefficient
increased. This picture stayed unchanged as we increased the
sample size to 60. In most cases, except for a sample size of
30 and r of 0.9, the Theil estimator performed marginally
better than MG2SLS. This suggests a gain in efficiency due
to the employment of the first observation even for a
relatively large sample size, i.e., n=60. Another important
observation is that, except for one case (r=0.6, N=60), the
relative performance of the Fair estimator deteriorated as
the autocorrelation coefficient or the sample size
increased. This suggests that the gain in
efficiency achieved by the Theil(True) estimator, which is used
as a base, has been much greater than that of the Fair
estimator when the sample size was doubled. This observation
casts some doubt on the usefulness of the Fair estimator for
estimation of models characterized by autocorrelation, at
least for the case of stochastic non-trended data.

Note: this index does not include the intercept.

To summarize, the above results provide important
information in guiding a potential econometric practitioner
in the choice of estimator for models characterized by
autocorrelation. Theil's estimator generally dominated all
other methods. Employment of the first observation improved
the performance of the Theil estimator. However, the
efficiency gain was not as large as was suggested by Park
and Mitchell(1980) and Maeshiro(1979). These results are in
line with Doran and Griffiths' (1982) findings in the single
equation context. Generalized LIML, even though inferior to
the Theil estimator, dominated ALIML in most cases, which
suggests that imposing the restriction that all
autocorrelation coefficients are equal, even when this is
not true, does not impose very large costs in terms of MSE.
The Fair estimator, which is a commonly used method for
estimation of autocorrelated models, performed poorly. In
most cases it proved to be inferior even to 2SLS, which
ignores the autocorrelation.

C.  The  Performance  of  Estimators  in  Hypothesis  Testing 

The performance of different estimators in tests of
hypotheses is as important for an econometric practitioner
as obtaining efficient estimates of the structural
coefficients.

There are a number of tests of hypotheses proposed in the
literature. Maddala(1974), using a two equation model,
compared the power functions of the tests of significance
proposed by Dhrymes(1969), Richardson and Rohr(1971) and
Anderson and Rubin(1949) with that of the conventional tests
of significance based on asymptotic theory. He found that
the conventional test performed better than the alternatives
proposed.

However, it must be noted that the comparison of the
performance of different estimators in hypothesis testing
has not received enough attention. Park and Mitchell(1980),
using the conventional tests of significance, examined the
performance of different single equation estimators when the
model possessed autocorrelation. They found that none of the
single equation estimators they considered performed well in
hypothesis testing: the percentage of times
the true null hypothesis was rejected was much greater than
the nominal level of significance. They attributed this
problem to the underestimation of the covariance matrix, even
when the Prais-Winsten estimator, which they found to give an
accurate estimate of the standard error, was used. They
considered this problem as serious as obtaining inefficient
estimates of the structural coefficients.

Following Park and Mitchell, we focus on the number of
times each method leads to the occurrence of a Type I error
at the 5 percent level of significance. Detailed statistics
concerning the Type I error and the power of the test
statistics for different sample sizes and different degrees
of autocorrelation are given in Appendix VI. In this
Chapter we shall only provide aggregate measures; namely,
the average Type I error and the average power of the test
statistics across equations. However, we shall accompany
these averages with the statistics concerning the range of
the Type I error and the power of tests across coefficients.
Table 5-16 shows the comparison of the estimators in
hypothesis testing for different values of the autocorrelation
coefficient and sample sizes of 30 and 60. The figures in
parentheses are the range of the Type I errors. For a sample
size of 30 we can observe that for a low degree of
autocorrelation, i.e., r=0.2, GLIML performed best and was
followed by the 2SLS, Theil and MG2SLS estimators. But as the
autocorrelation coefficient increased, Theil and its
modified version outperformed all others. In fact, their
observed Type I error moved toward the nominal level of
significance as the autocorrelation coefficient increased.
The reverse of this process occurred in the case of Fair's
estimator. Its observed Type I error increased as the
autocorrelation coefficient increased from 0.2 to 0.9.

Generally, for a sample size of 30 the observed Type I
errors of all estimators (except for Theil when r=0.9) were
much greater than the nominal level of significance,
suggesting that the tails of the test statistic distribution
were much thicker than those given by the t-distribution.
This result is in line with the conclusions reached by Park
and Mitchell in the single equation case.
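The link between understated standard errors and an inflated Type I error can be illustrated with a small simulation. This is a sketch only: it uses a normally distributed estimator with a known standard error, the normal critical value 1.96 in place of the t-distribution value used in the study, and an arbitrary understatement factor of 0.7; none of these numbers come from the experiments.

```python
import random


def rejection_rate(estimates, std_errors, hypothesized, critical_value):
    """Share of replications in which |t| exceeds the critical value."""
    rejections = 0
    for b, se in zip(estimates, std_errors):
        t_stat = (b - hypothesized) / se
        if abs(t_stat) > critical_value:
            rejections += 1
    return rejections / len(estimates)


rng = random.Random(0)  # arbitrary seed
true_beta, true_se, reps = 1.0, 0.5, 5000
draws = [rng.gauss(true_beta, true_se) for _ in range(reps)]

# The null hypothesis is true in both cases; only the reported
# standard error differs.
correct = rejection_rate(draws, [true_se] * reps, true_beta, 1.96)
understated = rejection_rate(draws, [0.7 * true_se] * reps, true_beta, 1.96)
```

With the correct standard error the rejection rate stays near the nominal 5 percent, while the understated standard error pushes it well above that level, which is the pattern Park and Mitchell attributed to underestimation of the covariance matrix.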

As the sample size increased to 60 (Table 5-16), all
estimators (except Fair) experienced a significant reduction
in the actual Type I error. In fact, their observed Type I
error became closer to the nominal level of significance.

For the low autocorrelation coefficient of 0.2, all
estimators (except Fair) performed more or less the same.

The increased sample size had a positive effect on the
performance of ALIML. In fact, ALIML had the lowest Type I
error for the sample size of 60 and the autocorrelation
coefficients of 0.2 and 0.6. The performance of the 2SLS and
Fair estimators deteriorated as the autocorrelation
coefficient increased to 0.9. These results suggest that
perhaps the main problem with the application of the 2SLS and
Fair estimators to autocorrelated models is that they
result in a much greater probability of Type I error than the
nominal level of significance indicates.

Table 5-17 presents the power of the test of
significance at the 5 percent significance level when the sample
size is equal to 30 and 60. Again the figures in parentheses
show the range of the power of the test across coefficients.
For a sample size of 30 we can observe that ALIML showed
the highest power for all degrees of autocorrelation. The
power of the 2SLS estimator deteriorated as the
autocorrelation coefficient increased, while that of the
Theil, MG2SLS, Fair and GLIML remained more or less the
same. The relatively high power of the ALIML, Fair and GLIML
estimators is misleading. It can be attributed to their
underestimation of the standard error, which in turn leads
to a high probability of Type I and a low probability of
Type II errors. It is conceivable that if the nominal level
of significance were lowered to the point where the observed
Type I error of these estimators equalled that of the Theil
estimator at the 5 percent level, their power would be much
lower than the above picture suggests.
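The size adjustment described in the last sentence can be sketched as follows. The idea is to pick the critical value from the simulated null distribution of the test statistic so that the observed Type I error matches a chosen target, and then to measure power with that critical value. The function names, the simulated t-statistics and the inflation factor are hypothetical and serve only to illustrate the mechanism.

```python
import random


def size_adjusted_critical_value(null_t_stats, target_size):
    """Critical value c such that about target_size of |t| under the null exceed c."""
    ordered = sorted(abs(t) for t in null_t_stats)
    index = int((1.0 - target_size) * len(ordered))
    index = min(index, len(ordered) - 1)
    return ordered[index]


def power(t_stats, critical_value):
    """Share of test statistics whose absolute value exceeds the critical value."""
    return sum(abs(t) > critical_value for t in t_stats) / len(t_stats)


rng = random.Random(0)  # arbitrary seed
# Invented t-statistics whose dispersion is inflated by an understated
# standard error (factor 0.7), under the null and under an alternative.
null_t = [rng.gauss(0.0, 1.0) / 0.7 for _ in range(5000)]
alt_t = [rng.gauss(2.0, 1.0) / 0.7 for _ in range(5000)]

cv = size_adjusted_critical_value(null_t, 0.05)
nominal_power = power(alt_t, 1.96)   # power at the nominal critical value
adjusted_power = power(alt_t, cv)    # power once the size is corrected
```

In this toy setting the adjusted critical value is larger than the nominal one and the measured power falls accordingly. To mirror the comparison in the text, `target_size` would be set to the observed Type I error of the Theil estimator at the 5 percent level.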

As the sample size increased to 60 (Table 5-17), the
observed power of all estimators (except Fair) increased.

The power of 2SLS decreased as the autocorrelation
coefficient increased. ALIML showed the highest power.

To summarize, the results of this Chapter suggest
important guidelines for econometric practitioners. In
practice a researcher usually deals with small or moderate
sample sizes. For sample sizes of about 30 observations
and when the autocorrelation coefficient is very low, i.e.,
r=0.2, the 2SLS estimator, which ignores the autocorrelation,
performs as well as any other estimator which corrects for
the presence of serial correlation in the disturbance term.
As the autocorrelation coefficient rises, the performance of
2SLS deteriorates rapidly. Fair's estimator, which is
probably the most commonly known method, did not perform much
better than 2SLS. In fact, in most cases it was inferior
to 2SLS, which ignores the autocorrelation. The Theil
estimator, which is completely ignored in empirical research,
turns out to be the best not only in the estimation of the
structural coefficients, but also in tests of hypotheses.

For larger sample sizes of about 60, the Theil
estimator performed best in the estimation of the structural
coefficients but was outperformed by ALIML in hypothesis
testing.


VI. LIMITED INFORMATION METHODS OF ESTIMATION IN THE PRESENCE OF LAGGED ENDOGENOUS VARIABLES

A.  Statement  of  the  Problem 

Estimation  problems  associated  with  the  presence  of 
autocorrelation  in  simultaneous  equation  systems  in  the 
absence  of  lagged  endogenous  variables  were  discussed  in 
Chapters  I  and  II.  In  this  chapter  we  will  consider  problems 
of  estimation  related  to  the  presence  of  lagged  endogenous 
variables  in  simultaneous  equation  systems.  We  shall  also 
analyse  different  methods  appropriate  for  estimation  of  this 
type  of  system. 

The presence of lagged endogenous variables among the
explanatory variables of a simultaneous equation system with
autocorrelated errors creates additional estimation
problems. Consider the following system

$$ YB + X^{*}\Gamma = U \qquad (1) $$

$$ U = U_{-1}R + E $$

where $Y$ is the $T \times G$ matrix of observations on the endogenous
variables. $X^{*}$ is the $T \times K$ matrix of observations on the
predetermined variables, i.e., $X^{*} = (Y_{-1}, X)$, where $Y_{-1}$ is a
matrix of endogenous variables lagged one period, and $X$ is
the matrix of purely exogenous variables of the system. $B$ is
the $G \times G$ matrix of structural coefficients associated with
the current endogenous variables. $\Gamma$ is the $K \times G$ matrix of
coefficients of the predetermined variables. $U$ is the $T \times G$
matrix of disturbances.


The first structural equation of this system can be
written as

$$ y_1 = Y_1\beta_1 + X_1^{*}\gamma_1 + u_1 \qquad (2) $$

$$ u_1 = r\,u_{1,-1} + \epsilon_1 $$

$$ X_1^{*} = (Y_{1,-1}, X_1) $$

where $y_1$ and $Y_1$ are appropriate sub-matrices of $Y$; $X_1$, $r$, $u_1$
and $\epsilon_1$ are appropriate sub-matrices of $X$, $R$, $U$ and $E$,
respectively; and $Y_{1,-1}$ is a subset of $Y$ lagged one period.

In the absence of lagged endogenous variables, the problems
of estimation of system (2) are the correlation between $Y_1$
and $u_1$, and the autocorrelation of the error term, $u_1$. The
presence  of  lagged  endogenous  variables  in  a  simultaneous 
equation  system  without  autocorrelation  does  not  create  any 
additional  estimation  problems.  The  only  problem  would  be 
the  correlation  between  the  current  endogenous  variables 
appearing  among  the  explanatory  variables  of  the  system  and 
the  error  term.  There  will  not  be  any  correlation  of  the 
lagged  endogenous  variables  and  the  current  error  terms. 
However,  when  the  current  error  terms  are  correlated  with 
their  past  values,  they  will  be  also  correlated  with  the 
lagged  endogenous  variables.  Therefore,  the  presence  of  the 
lagged  endogenous  variables  poses  additional  estimation 
problems  when  a  model  is  characterized  by  autocorrelation. 

In the absence of lagged endogenous variables,
application of standard techniques which ignored the
autocorrelation property of the error terms resulted in
consistent but inefficient estimates of the structural
coefficients. However, when the structural equation includes
lagged endogenous variables, standard techniques no longer
provide consistent estimates of the structural coefficients.
This is because in this case the predetermined
variables are also correlated with the error terms.
Therefore, estimators appropriate for estimation of
autocorrelated models with lagged endogenous variables are
those which take into account both the simultaneity and the
correlation of the predetermined variables with the error
terms.
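The loss of consistency can be illustrated with a small simulation. The sketch below is a single-equation analogue and not the simultaneous system (2): it assumes the hypothetical model y_t = a*y_{t-1} + u_t with u_t = r*u_{t-1} + e_t, and the function names, parameter values and sample size are illustrative choices only. For this simple model the OLS slope converges to (a + r)/(1 + a*r) rather than to a.

```python
import numpy as np

def simulate_lagged_dep(T, a, r, rng, burn=200):
    """Simulate y_t = a*y_{t-1} + u_t with AR(1) errors u_t = r*u_{t-1} + e_t."""
    n = T + burn
    e = rng.standard_normal(n)
    u = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        u[t] = r * u[t - 1] + e[t]
        y[t] = a * y[t - 1] + u[t]
    return y[burn:]  # drop the burn-in so the start-up values do not matter

def ols_slope(y):
    """OLS regression (no intercept) of y_t on y_{t-1}."""
    y_lag, y_cur = y[:-1], y[1:]
    return float(y_lag @ y_cur / (y_lag @ y_lag))

rng = np.random.default_rng(0)
a = 0.5  # hypothetical true coefficient of the lagged dependent variable

# Without autocorrelation the lagged variable is predetermined: OLS is consistent.
est_no_ac = ols_slope(simulate_lagged_dep(20000, a, 0.0, rng))

# With r = 0.6 the lagged variable is correlated with the error term:
# the OLS slope converges to (a + r) / (1 + a*r), about 0.846 here, not to a.
est_ac = ols_slope(simulate_lagged_dep(20000, a, 0.6, rng))

print(round(est_no_ac, 3), round(est_ac, 3))
```

Even with 20,000 observations the second estimate stays far from 0.5, which is the sense in which the problem is one of consistency rather than efficiency.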

B.  Methods  of  Estimation 

There  are  a  number  of  estimators  designed  for 
estimation  of  the  model  (2).  Two  of  these  estimators,  namely 
Fair  and  modified  Brundy  and  Jorgenson  were  discussed  in 
Chapter  II.  The  structure  of  these  estimators  remains  the 
same  except  that  in  this  case  the  X  matrix  is  not  purely 
exogenous,  but  contains  lagged  endogenous  variables. 
Asymptotic properties of these two estimators are also the
same as they were in the case with no lagged endogenous
variables.

Theil  Estimator 

A limited information method that can be employed for
estimation of the model (2) is a modified version of Theil's
generalized two stage least squares estimator (G2SLSM).$^{51}$ A
modification is needed since the procedure outlined in
Chapter II for the G2SLS estimator is not applicable to
models with lagged endogenous variables. The reason is that
in the absence of lagged endogenous variables, a consistent
estimate of the autocorrelation coefficient of the
structural equation under consideration was used to
construct the Prais-Winsten transformation matrix, $P$. This
matrix was then employed to transform the ordinary reduced
form of the system as

$$ PY = PX\Pi + PUB^{-1} \qquad (3) $$

A consistent estimate of $PY$ could then be obtained by
application of OLS to (3). However, if $X$ includes lagged
endogenous variables, the transformed $X$ would be correlated
with the error term $PUB^{-1}$, which includes $u_i$ and $u_{i,-1}$ for
$i=1,\dots,G$.$^{52}$ Therefore, a consistent estimate of $PY$ cannot
be obtained through this approach. To remedy this problem we
propose the following two alternatives:

1- Write the transformed ordinary reduced form of the
system (1) as

51 Note that G2SLSM is different from the modified
G2SLS (MG2SLS) introduced in Chapter II. MG2SLS was similar
to G2SLS except for the omission of the first observation.
The G2SLSM estimator discussed in this chapter is the
analogue of G2SLS which takes account of lagged endogenous
variables present in the structural equation under
consideration.

52 Note that the transformed X's will be uncorrelated with
the error terms only if all the autocorrelation coefficients
of the structural equations are equal, i.e., $PUB^{-1} = EB^{-1}$.

$$ PY = PY_{-1}\Pi_1^{*} + PX\Pi_2^{*} + PUB^{-1} $$

or

$$ \dot{Y} = \dot{Y}_{-1}\Pi_1^{*} + \dot{X}\Pi_2^{*} + V \qquad (4) $$

Obtain a consistent estimate of $PY$ by applying the
instrumental variable method to (4), using a proper
instrument for $Y_{-1}$, i.e., $\hat{Y}_{-1} = X_{-1}\hat{\Pi}$. If $W_1$ is the set of
instruments so selected, we will have

$$ \hat{\Pi}_{IV} = (\dot{W}'\dot{X}^{*})^{-1}\dot{W}'\dot{Y} $$

where

$$ \dot{W} = (PW_1 \;\; PX) $$

$$ \dot{X}^{*} = (PY_{-1} \;\; PX) $$

$$ \Pi_{IV} = (\Pi_1^{*\prime} \;\; \Pi_2^{*\prime})' $$

therefore

$$ \hat{\dot{Y}} = \dot{Y}_{-1}\hat{\Pi}_1^{*} + \dot{X}\hat{\Pi}_2^{*} \qquad (5) $$

2- Alternatively, we can obtain a consistent estimate of $PY$
by employing the augmented reduced form of the system (1)

$$ Y = Y_{-1}\Pi_1 + Y_{-2}\Pi_2 + X\Pi_3 + X_{-1}\Pi_4 + EB^{-1} \qquad (6) $$

A consistent estimate of $PY$ can be obtained by transforming
(6) and estimating the resultant transformed equation by
OLS. Having estimated $PY$ consistently, Theil's G2SLSM
estimator of the system (2) can be obtained as follows:

Write the model presented in (2) as

$$ y_1 = Y_1\beta_1 + Y_{1,-1}\gamma_{11} + X_1\gamma_{12} + u_1 \qquad (7) $$

If the autocorrelation coefficient is known to be equal to
$r$, we can use the Prais-Winsten transformation on (7) to
obtain$^{53}$

53 Note that we shall use the Prais-Winsten transformation
matrix only if a consistent estimate of $PY$ is obtained

$$ Py_1 = PY_1\beta_1 + PY_{1,-1}\gamma_{11} + PX_1\gamma_{12} + Pu_1 $$

or

$$ \dot{y}_1 = \dot{Y}_1\beta_1 + \dot{Y}_{1,-1}\gamma_{11} + \dot{X}_1\gamma_{12} + \epsilon_1 \qquad (8) $$
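The Prais-Winsten transformation that leads from (7) to (8) can be sketched numerically. The code below assumes the standard form of the matrix for first order autocorrelation (first row scaled by the square root of 1 - r^2, remaining rows forming quasi-differences); the function names are illustrative and not taken from the thesis. The CORC variant simply drops the first row.

```python
import numpy as np

def prais_winsten_matrix(r, T):
    """T x T Prais-Winsten matrix P for an AR(1) error with coefficient r."""
    P = np.eye(T)
    P[0, 0] = np.sqrt(1.0 - r**2)   # rescaled first observation
    idx = np.arange(1, T)
    P[idx, idx - 1] = -r            # rows 2..T form u_t - r*u_{t-1}
    return P

def corc_matrix(r, T):
    """(T-1) x T Cochrane-Orcutt matrix: P without the first observation."""
    return prais_winsten_matrix(r, T)[1:, :]

def ar1_covariance(r, T):
    """Covariance matrix of an AR(1) process with unit innovation variance."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return r**lags / (1.0 - r**2)

# The transformed disturbances P u are serially uncorrelated with unit variance.
P = prais_winsten_matrix(0.6, 5)
whitened = P @ ar1_covariance(0.6, 5) @ P.T
print(np.round(whitened, 10))
```

The check at the end shows why the transformed equation can be treated as a model with well-behaved errors: applying P to an AR(1) disturbance vector yields an identity covariance matrix.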

Theil's G2SLSM estimator of the coefficients of the
system (7) can be obtained by replacing $\dot{Y}_1$ in (8) by its
consistent estimate obtained above and estimating the
resultant equation by OLS; that is,

$$ \hat{\delta}_1 = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'\dot{y}_1 \qquad (9) $$

where

$$ \hat{Z}_1 = (\hat{\dot{Y}}_1 \;\; \dot{Y}_{1,-1} \;\; \dot{X}_1) $$

$$ \delta_1 = (\beta_1' \;\; \gamma_{11}' \;\; \gamma_{12}')' $$

Under the assumption of known and equal autocorrelation
coefficients, the G2SLSM estimator (9) is equivalent to the
usual 2SLS estimator of the transformed system (8) and
therefore is consistent and efficient within the class of
limited information estimators. Its asymptotic covariance
matrix is

$$ \mathrm{Asy\text{-}cov}(\hat{\delta}_1) = \sigma^2\,\mathrm{plim}\; T(\hat{Z}_1'\hat{Z}_1)^{-1} \qquad (10) $$

53 (cont'd) through the estimation of the ordinary reduced
form. If the augmented reduced form is used and therefore
only T-1 fitted values for $PY$ are obtained, we will use the
CORC transformation matrix which disregards the
transformation of the first observation.

However, new problems of estimation arise when the
autocorrelation coefficient is unknown and has to be
replaced by its consistent estimate. A consistent estimate
of the autocorrelation coefficient can be obtained from the
estimated residuals of the system (7). To estimate (7), we
not only have to consider the correlation between $Y_1$ and $u_1$,
but also should take into account the correlation between
$Y_{1,-1}$ and $u_1$. This means that at the initial stage we should
use an instrumental variable estimator which replaces $Y_1$ and
$Y_{1,-1}$ by proper instruments, possibly from the list of the
current and lagged values of the exogenous variables
excluded from the structural equation (7). In other words,
if $W$ is the matrix of instruments so selected, the first
stage instrumental variable estimator of $\delta_1$ is

$$ \hat{\delta}_{IV} = (W'Z_1)^{-1}W'y_1 \qquad (11) $$

where

$$ Z_1 = (Y_1 \;\; Y_{1,-1} \;\; X_1) $$

Under standard assumptions the estimator in (11) is
consistent. Then $\hat{\delta}_{IV}$ can be used to obtain a consistent
estimate of the residuals $u_1$, and thus to calculate a
consistent estimate of the autocorrelation coefficient,
denoted as $\hat{r}$. Using $\hat{r}$ we can transform the structural
equation (7) and estimate the structural coefficients
following the procedure outlined above.
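The first-stage computations described above can be sketched as follows. The instrumental variable formula is the one in (11); the estimate of the autocorrelation coefficient is taken here to be the regression of the residuals on their own lag, which is an assumption on our part since the text does not spell out the formula. Function names are illustrative.

```python
import numpy as np

def iv_estimate(W, Z, y):
    """Instrumental variable estimator (W'Z)^{-1} W'y as in (11).

    W and Z must have the same number of columns (just-identified form).
    """
    return np.linalg.solve(W.T @ Z, W.T @ y)

def ar1_coefficient(resid):
    """Estimate r by regressing u_t on u_{t-1} (assumed formula)."""
    return float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))

def first_stage(W, Z, y):
    """Return the IV coefficients, the residuals and the implied r."""
    delta_iv = iv_estimate(W, Z, y)
    resid = y - Z @ delta_iv        # residuals use the actual regressors Z
    return delta_iv, resid, ar1_coefficient(resid)
```

Note that the residuals are computed with the actual regressors and not with the instruments; the instruments only enter through the coefficient estimate.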

Following the approach adopted by Fair (1970), we can
show the consistency of Theil's G2SLSM estimator when it
uses a consistent estimate of the autocorrelation
coefficient. To do this we can write equation (8), with the
transformation now based on $\hat{r}$, as$^{54}$

$$ \dot{y}_1 = \hat{\dot{Y}}_1\beta_1 + \dot{Y}_{1,-1}\gamma_{11} + \dot{X}_1\gamma_{12} + [u_{1,-1}(r - \hat{r}) + \epsilon_1 + \hat{V}_1\beta_1] \qquad (12) $$

where

$$ \hat{V}_1 = Y_1 - \hat{Y}_1 $$

and $\hat{Y}_1$ is a consistent estimate of $Y_1$ obtained from the
ordinary or augmented reduced form. Theil's G2SLSM estimator
of the structural coefficients can be obtained by estimating
equation (12) by OLS. Application of OLS to equation (12)
yields the estimated coefficients which minimize the sum of
squared residuals. Therefore, if the three components of the
error term are orthogonal to each other, the minimum of the sum
of squared residuals occurs where the square of each component
term is at its minimum. The square of the first term reaches its
minimum when $\hat{r} = r$, which in turn eliminates $u_{1,-1}$ from the
residual term and ensures consistency. Therefore,
Theil's estimator is consistent only if $u_{1,-1}$ and $\hat{V}_1$ are
orthogonal to each other. This is so since, if they are not
orthogonal, the minimum sum of squared residuals of (12)
will not occur where $\hat{r} = r$. This in turn leaves $u_{1,-1}$ in
the residual term, which is correlated with $Y_{1,-1}$, and
therefore renders Theil's estimator inconsistent.

54 Note that for simplicity we shall ignore the transformed
first observation, which is asymptotically negligible.

In the case where the augmented reduced form is used to
obtain a consistent estimate of $Y_1$, $u_{1,-1}$ and $\hat{V}_1$ are
orthogonal since $u_{1,-1}$ is a function of $Y_{-1}$, $Y_{-2}$ and $X_{-1}$
(see (14) below) and these variables are used as instruments
in the estimation of $\hat{V}_1$. Therefore, Theil's G2SLSM estimator
will be consistent. Proof of the consistency of G2SLSM when
it uses the ordinary reduced form also requires the
orthogonality of $u_{1,-1}$ and $\hat{V}_1$. To demonstrate the
orthogonality of $u_{1,-1}$ and $\hat{V}_1$ when the consistent estimate
of $Y$ is obtained through the ordinary reduced form equation
(4), write the system (1) as

$$ YB + Y_{-1}\Gamma_1 + X\Gamma_2 = U \qquad (13) $$

From (13), we have

$$ U_{-1} = Y_{-1}B + Y_{-2}\Gamma_1 + X_{-1}\Gamma_2 = (Y_{-1} \;\; Y_{-2} \;\; X_{-1})H \qquad (14) $$

Also equation (5) can be written as


Equation (14) shows that $U_{-1}$ is a function of $Y_{-1}$, $Y_{-2}$ and
$X_{-1}$. Equation (15) demonstrates that these variables are
also used as instruments in obtaining $\hat{Y}_1$ and thus in
calculating $\hat{V}_1$. Therefore, by the property of OLS, $\hat{V}_1$ and
$U_{-1}$ are orthogonal. Hence, Theil's G2SLSM estimator, when it
utilizes a consistent estimate of the autocorrelation
coefficient, is consistent.

Dhrymes'  Estimators 

Dhrymes et al. (1974) proposed a number of alternative
estimators for estimation of the model (2). These estimators
can be reduced to two basically different methods. One
method is a two-step instrumental variable estimator and
employs the ordinary reduced form to obtain predictions of
the current and lagged endogenous variables appearing among
the explanatory variables of the structural equation under
consideration. The second method, which they called the
"converging iterate two stage least squares autoregressive
estimator (C2SLSA)", is a generalization of the Fair
estimator discussed in Chapter II.

The two-step instrumental variable estimator designed
for estimation of system (2) is constructed as follows:

1- Write the system under consideration as

$$ YB + Y_{-1}\Gamma_1 + X\Gamma_2 = U \qquad (16) $$

Estimate each structural equation by an instrumental
variable estimator and form consistent estimates of $B$, $\Gamma_1$
and $\Gamma_2$. Using these consistent estimates, form the reduced
form of the system (16) and obtain consistent predictions of
the current dependent variables as

$$ \hat{Y} = \hat{Y}_{-1}\hat{\Pi}_1 + X\hat{\Pi}_2 \qquad (17) $$

where

$$ \hat{\Pi}_1 = -\hat{\Gamma}_1\hat{B}^{-1} $$

$$ \hat{\Pi}_2 = -\hat{\Gamma}_2\hat{B}^{-1} $$

For (17) to be operational, we need a starting value for $Y$
at time zero, i.e., $Y_0$. It is permissible to choose an
initial condition of $Y_0 = 0$. Therefore, using (17) and the
starting value for $Y$, we obtain fitted values of $Y_1$, denoted
$\hat{Y}_1$, and their one period lags, $\hat{Y}_{1,-1}$.

2- Obtain a consistent estimate of the autocorrelation
coefficient of the structural equation under consideration,
using a consistent estimate of the residuals calculated from
the instrumental variable estimation at the first step.

Define $\dot{W}_1$, $\dot{Z}_1$, $\dot{y}_1$ and $\delta_1$ as

$$ \dot{W}_1 = [P\hat{Y}_1, \; P\hat{Y}_{1,-1}, \; PX_1] $$

$$ \dot{Z}_1 = [PY_1, \; PY_{1,-1}, \; PX_1] $$

$$ \dot{y}_1 = Py_1 $$

$$ \delta_1 = (\beta_1', \; \gamma_{11}', \; \gamma_{12}')' $$

where $P$ is the Prais-Winsten transformation matrix. Then
the proposed two-step instrumental variable estimator is

$$ \hat{\delta}_1 = (\dot{W}_1'\dot{Z}_1)^{-1}\dot{W}_1'\dot{y}_1 \qquad (18) $$

This two-step estimator is very closely related to the
instrumental variable method proposed by Brundy and
Jorgenson (1971) for systems with no autocorrelation. It can
also be seen that this two-step estimator is nothing more
than a method which treats the lagged endogenous variable as
an endogenous variable and replaces it with its systematic
part in the estimation.
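The construction of the fitted values in the first step of the two-step estimator can be sketched as below. The sketch reads (17) as a recursion started from a zero initial value, which is our interpretation of the remark about the starting value, and forms the reduced form coefficients from the structural estimates as stated under (17). All names and the small numerical example are hypothetical.

```python
import numpy as np

def reduced_form_from_structure(B, Gamma1, Gamma2):
    """Pi1 = -Gamma1 B^{-1} and Pi2 = -Gamma2 B^{-1}, as stated under (17)."""
    B_inv = np.linalg.inv(B)
    return -Gamma1 @ B_inv, -Gamma2 @ B_inv

def dynamic_fitted_values(X, Pi1, Pi2, y0=None):
    """Fitted values from Y_hat[t] = Y_hat[t-1] Pi1 + X[t] Pi2.

    The recursion starts from y0, which defaults to zero.
    """
    T, G = X.shape[0], Pi1.shape[0]
    Y_hat = np.zeros((T, G))
    prev = np.zeros(G) if y0 is None else np.asarray(y0, dtype=float)
    for t in range(T):
        Y_hat[t] = prev @ Pi1 + X[t] @ Pi2
        prev = Y_hat[t]
    return Y_hat

# Hypothetical two-equation example with a constant as the only exogenous variable.
B = np.eye(2)
Gamma1 = -0.5 * np.eye(2)
Gamma2 = -np.ones((1, 2))
Pi1, Pi2 = reduced_form_from_structure(B, Gamma1, Gamma2)
Y_hat = dynamic_fitted_values(np.ones((3, 1)), Pi1, Pi2)
print(Y_hat)
```

The one period lag of the fitted values is obtained by shifting `Y_hat` down one row and placing the starting value in the first row.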

The second method, C2SLSA, is constructed as follows:

1- Obtain the augmented reduced form of the system (1)
as

$$ Y = Y_{-1}\Pi_1 + Y_{-2}\Pi_2 + X\Pi_3 + X_{-1}\Pi_4 + W \qquad (19) $$

Regress each column of $Y_1$ on $Y_{-1}$, $Y_{-2}$, $X$ and $X_{-1}$ and obtain
the predicted values of $Y_1$, denoted as $\hat{Y}_1$.$^{55}$

2- Estimate the structural equation (2) with an
instrumental variable estimator and obtain a consistent
estimate of the autocorrelation coefficient $r$ using the
estimated residuals calculated from the instrumental
variable estimation.


55 Note that $\hat{Y}_1$ can also be calculated from (19) by
replacing $\Pi_i$, $i=1,\dots,4$, by their consistent estimates.

3- Using the estimated $r$, transform the structural
equation (2) as

$$ y_1 - \hat{r}y_{1,-1} = (Y_1 - \hat{r}Y_{1,-1})\beta_1 + (X_1^{*} - \hat{r}X_{1,-1}^{*})\gamma_1 + \epsilon_1 $$

or

$$ y_1 - \hat{r}y_{1,-1} = (Y_1 - \hat{r}Y_{1,-1})\beta_1 + [(Y_{1,-1} - \hat{r}Y_{1,-2}), \; (X_1 - \hat{r}X_{1,-1})]\gamma_1 + \epsilon_1 \qquad (20) $$

4- Define $\hat{Z}_1$, $\dot{y}_1$ and $\delta_1$ as

$$ \hat{Z}_1 = [(\hat{Y}_1 - \hat{r}Y_{1,-1}), \; (Y_{1,-1} - \hat{r}Y_{1,-2}), \; (X_1 - \hat{r}X_{1,-1})] $$

$$ \dot{y}_1 = y_1 - \hat{r}y_{1,-1} $$

$$ \delta_1 = (\beta_1' \;\; \gamma_1')' $$

Estimate the structural coefficients as

$$ \hat{\delta}_1 = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'\dot{y}_1 \qquad (21) $$

5- Using $\hat{\delta}_1$, recompute $\hat{r}$ and repeat from step (3)
until convergence.
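The iteration in steps 1 to 5 can be sketched in code. This is a simplified illustration under stated assumptions and not a reproduction of the algorithm of Dhrymes et al.: the starting value of r is supplied by the caller instead of coming from an initial instrumental variable regression, r is re-estimated by regressing the residuals on their own lag, and the simulated data at the end (one endogenous regressor, no lagged endogenous regressor, hypothetical coefficients 0.5, 1.0 and r = 0.6) are purely illustrative.

```python
import numpy as np

def fitted(Q, A):
    """Fitted values from OLS regressions of the columns of A on Q."""
    coef, *_ = np.linalg.lstsq(Q, A, rcond=None)
    return Q @ coef

def ar1_coef(u):
    """Regression of u_t on u_{t-1}."""
    return float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))

def c2slsa_sketch(y, Yc, Xp, Q, r0=0.0, max_iter=100, tol=1e-8):
    """Iterate between the coefficients and r.

    y  : (N,)   dependent variable of the equation
    Yc : (N, g) current endogenous regressors
    Xp : (N, k) predetermined regressors (lagged endogenous and exogenous)
    Q  : (N, q) regressors of the augmented reduced form
    Rows must be consecutive time periods.
    """
    Yc_hat = fitted(Q, Yc)                        # step 1: reduced form fit
    Z = np.hstack([Yc, Xp])                       # actual regressors
    r = r0
    for _ in range(max_iter):
        y_dot = y[1:] - r * y[:-1]                # step 3: quasi-differences
        Z_dot = np.hstack([Yc_hat[1:] - r * Yc[:-1], Xp[1:] - r * Xp[:-1]])
        delta, *_ = np.linalg.lstsq(Z_dot, y_dot, rcond=None)   # step 4
        r_new = ar1_coef(y - Z @ delta)           # step 5: update r
        converged = abs(r_new - r) < tol
        r = r_new
        if converged:
            break
    return delta, r

# Illustrative data: Y2 is endogenous because v is correlated with e.
rng = np.random.default_rng(1)
N = 4001
x = rng.standard_normal((N, 2))
e = rng.standard_normal(N)
v = 0.5 * e + rng.standard_normal(N)
u = np.zeros(N)
for t in range(1, N):
    u[t] = 0.6 * u[t - 1] + e[t]
Y2 = x @ np.array([1.0, -1.0]) + v
y = 0.5 * Y2 + 1.0 * x[:, 0] + u

# Augmented reduced form regressors: current and lagged x, lagged y and Y2.
Q = np.column_stack([x[1:], x[:-1], y[:-1], Y2[:-1]])
delta_hat, r_hat = c2slsa_sketch(y[1:], Y2[1:, None], x[1:, :1], Q)
print(np.round(delta_hat, 2), round(r_hat, 2))
```

The lagged dependent and endogenous variables are included in `Q` so that every quasi-differenced regressor lies in the space spanned by the first-stage regressors, which is the condition the consistency argument for this type of estimator relies on.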

Dhrymes et al. showed that both the two-step
instrumental variable and C2SLSA estimators are consistent.
They also showed that the C2SLSA estimator, which uses the
augmented reduced form to obtain predictions of the current
dependent variables, is more efficient than the instrumental
variable method which uses the ordinary reduced form. Under
the assumption of a known coefficient of autocorrelation, the
asymptotic covariance matrix of the C2SLSA estimator is
equal to

$$ \mathrm{Asy\text{-}cov}(\hat{\delta}_1) = \sigma^2\,\mathrm{plim}\; T(\hat{Z}_1'\hat{Z}_1)^{-1} \qquad (22) $$

However, when the coefficient of autocorrelation is unknown
and is replaced by its consistent estimate, the C2SLSA
estimator becomes asymptotically less efficient. Derivation
of the asymptotic covariance matrix of this estimator when
the coefficient of autocorrelation is unknown is given in
Dhrymes et al. (1974).

Hatanaka's Estimators

Hatanaka (1976) developed three alternative limited
information two-step estimators for estimation of the system
(2). These efficient methods are based on the technique
which he developed for estimation of a dynamic single
equation model with autoregressive errors (Hatanaka, 1974).
He called it "the Residual-Adjusted Aitken estimator" since
it was basically an Aitken type estimator except that it
included the estimated residuals, lagged one period, in the
list of regressors to ensure asymptotic efficiency.
Hatanaka's Residual-Adjusted Aitken estimator is similar to
a modified method of scoring for maximizing the
log-likelihood function except that in its derivation the
terms with zero probability limits were ignored. For finite
sample sizes the Residual-Adjusted Aitken estimator is not
identical to the modified method of scoring.$^{56}$ However, this
estimator is identical to a two-step Gauss-Newton estimator.
Only one iteration of Gauss-Newton, that is, a regression of
the residuals, $\epsilon_t$, on $\partial\epsilon_t/\partial\delta$, yields an asymptotically
efficient estimator of the structural coefficients.$^{57}$

56 Hatanaka (1974), p.202.

57 For discussion of different methods of numerical
optimization and the equivalence of Hatanaka's estimator and
the two-step Gauss-Newton estimator see Harvey (1981),
Chapters 4 and 8.

The three alternative limited information estimators
which Hatanaka proposed using the method developed in the
Residual-Adjusted Aitken estimator share the following first
step in common:

1- Estimate the structural equation by an instrumental
variable estimator and obtain a consistent estimate of the
residuals, $\hat{u}_1$. Then use the estimated residuals to obtain a
consistent estimate of the autocorrelation coefficient $r$,
denoted as $\hat{r}$, using the Cochrane-Orcutt method.

The second step for these three alternative methods,
denoted as A, B, and C, is as follows:

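Method A:

Use the augmented reduced form

$$ Y = Y_{-1}\Pi_1 + Y_{-2}\Pi_2 + X\Pi_3 + X_{-1}\Pi_4 + W \qquad (23) $$

and regress each column of $Y_1$ on $Y_{-1}$, $Y_{-2}$, $X$, $X_{-1}$ to obtain
fitted values for $Y_1$, denoted as $\hat{Y}_1$. Define $\hat{Z}_1$, $\bar{y}_1$ and $\delta_1$
as

$$ \hat{Z}_1 = [(\hat{Y}_1 - \hat{r}Y_{1,-1}), \; (Y_{1,-1} - \hat{r}Y_{1,-2}), \; (X_1 - \hat{r}X_{1,-1}), \; \hat{u}_{1,-1}] $$

$$ \bar{y}_1 = y_1 - \hat{r}y_{1,-1} $$

$$ \delta_1 = (\beta_1' \;\; \gamma_1')' $$

Note that the matrix $\hat{Z}_1$ has been augmented by the vector of
lagged residuals $\hat{u}_{1,-1}$. Then estimate $\delta_1$ and $r_1$ as

$$ \begin{pmatrix} \hat{\delta}_1 \\ \hat{r}_1 \end{pmatrix} = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'\bar{y}_1 \qquad (25) $$

Finally, the estimator is defined as

$$ \begin{pmatrix} \hat{\delta}_1 \\ \hat{r} + \hat{r}_1 \end{pmatrix} $$

This estimator can be regarded as a simplification and
extension of the Sargan (1961) and Amemiya (1966) methods.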
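The second step of Method A amounts to one regression in which the lagged residual is added to the quasi-differenced regressors, its coefficient then being added to the first-step estimate of r. The sketch below follows that description; the function and argument names are illustrative, and the exact treatment of the fitted and actual lagged values in the first block of regressors is an assumption. The small check uses artificial noise-free data (hypothetical coefficients, true r = 0.5, deliberately wrong first-step value 0.3) to show how the coefficient on the lagged residual corrects the first-step estimate.

```python
import numpy as np

def hatanaka_second_step(y, Yc, Yc_hat, Xp, u_hat, r_hat):
    """Residual-adjusted second step (sketch of Method A).

    y      : (N,)   dependent variable
    Yc     : (N, g) current endogenous regressors
    Yc_hat : (N, g) their fitted values from the augmented reduced form
    Xp     : (N, k) predetermined regressors of the equation
    u_hat  : (N,)   first-step residuals
    r_hat  : float  first-step estimate of r
    """
    y_bar = y[1:] - r_hat * y[:-1]
    Z_aug = np.hstack([
        Yc_hat[1:] - r_hat * Yc[:-1],
        Xp[1:] - r_hat * Xp[:-1],
        u_hat[:-1, None],               # lagged residuals as an extra regressor
    ])
    coef, *_ = np.linalg.lstsq(Z_aug, y_bar, rcond=None)
    delta, r1 = coef[:-1], float(coef[-1])
    return delta, r_hat + r1            # corrected estimate of r

# Noise-free check: u_t = 0.5 * u_{t-1} exactly, first-step value r_hat = 0.3.
rng = np.random.default_rng(0)
n = 12
Yc = rng.standard_normal((n, 1))
Xp = rng.standard_normal((n, 2))
u = 0.5 ** np.arange(n)
d_true = np.array([0.4, 1.0, -2.0])
y = np.hstack([Yc, Xp]) @ d_true + u
delta_hat, r_corrected = hatanaka_second_step(y, Yc, Yc, Xp, u, 0.3)
print(np.round(delta_hat, 6), round(r_corrected, 6))
```

In this artificial case the quasi-differenced equation has the exact error (0.5 - 0.3) times the lagged residual, so the regression recovers both the coefficients and the true r in one step.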


Method B:

The second estimator is a simplified form of the Brundy
and Jorgenson instrumental variable method which was
modified by Fair (1972) to take into account the
autocorrelation property of the disturbances. The procedure
followed in this method is as follows:

1- Obtain fitted values of $Y_1$ from the reduced form

$$ \hat{Y} = Y_{-1}\hat{\Pi}_1 + Y_{-2}\hat{\Pi}_2 + X\hat{\Pi}_3 + X_{-1}\hat{\Pi}_4 \qquad (26) $$

where $\hat{\Pi}_1$, $\hat{\Pi}_2$, $\hat{\Pi}_3$ and $\hat{\Pi}_4$ are consistent estimates of the
reduced form coefficients formed from consistent estimates
of the structural coefficients obtained from the first step
instrumental variable estimation applied to all the
equations. Define $\hat{W}_1$ and $\bar{Z}_1$ as

$$ \hat{W}_1 = [(\hat{Y}_1 - \hat{r}Y_{1,-1}), \; (Y_{1,-1} - \hat{r}Y_{1,-2}), \; (X_1 - \hat{r}X_{1,-1}), \; \hat{u}_{1,-1}] $$

$$ \bar{Z}_1 = [(Y_1 - \hat{r}Y_{1,-1}), \; (Y_{1,-1} - \hat{r}Y_{1,-2}), \; (X_1 - \hat{r}X_{1,-1}), \; \hat{u}_{1,-1}] $$

Then estimate the structural coefficients with the
instrumental variable method, namely

$$ \begin{pmatrix} \hat{\delta}_1 \\ \hat{r}_1 \end{pmatrix} = (\hat{W}_1'\bar{Z}_1)^{-1}\hat{W}_1'\bar{y}_1 \qquad (27) $$

The estimator is then defined by

$$ \begin{pmatrix} \hat{\delta}_1 \\ \hat{r} + \hat{r}_1 \end{pmatrix} \qquad (28) $$

Method C:

To obtain the third estimator, define

$$ \bar{u}_1 = \hat{u}_1 - \hat{r}\hat{u}_{1,-1} \qquad (29) $$

Then estimate the structural coefficients as

$$ \begin{pmatrix} \hat{\delta}_1 \\ \hat{r}_1 \end{pmatrix} = (\hat{W}_1'\hat{W}_1)^{-1}\hat{W}_1'\bar{u}_1 \qquad (30) $$

Then the third alternative estimator is

$$ \begin{pmatrix} \hat{\delta}_1 + \hat{\delta}_{IV} \\ \hat{r} + \hat{r}_1 \end{pmatrix} $$

where $\hat{\delta}_{IV}$ is the instrumental variable estimate of the first
structural equation coefficients obtained in the first step.
This estimator can be interpreted as a second step in a
Gauss-Newton procedure.$^{58}$

Hatanaka showed that these three two-step estimators
are consistent and asymptotically efficient within the class
of limited information estimators. Therefore, the choice
among them should be made with regard to their small sample
performance.

58 See Pagan (1982).

Other Estimators

Other limited information estimators have been proposed
for estimation of dynamic autoregressive simultaneous
equation systems. In fact, a large number of asymptotically
efficient estimators can be defined using the estimator
generating equation in Hendry (1976). As a particular case,
Hendry and Srba (1977) proposed an instrumental variable
estimator (AIV) which minimizes a quadratic form with
respect to the structural coefficients.$^{59}$ Using a two
equation dynamic autoregressive model they compared the
small sample properties of AIV with 2SLS, OLS and a
variation of OLS which corrects for the presence of
autocorrelation. They found that AIV outperformed other
estimators for large samples (T>55) and considerable
autocorrelation (r>0.5); 2SLS was optimal for large samples
and small autocorrelation coefficients; and OLS was best for
small samples and low autocorrelation coefficients.$^{60}$

Wang and Fuller (1982) also proposed a limited
information method for estimation of dynamic autoregressive
models. Their estimator is an extension of the two-step
Gauss-Newton procedure which is similar to the Hatanaka
(Method A) estimator but was independently derived in Wang
and Fuller (1975).

59 See Hendry and Srba (1977), p.971.

60 Note that this estimator was not known to us at the time
of planning for simulation and therefore is not used in this
thesis.

C. Small Sample Investigation

To investigate the small sample properties of the
limited information estimators designed for the estimation
of dynamic simultaneous equations models with
autocorrelated errors we used the following three equation
model, which is a modification of the one used in Chapter V.

$$ YB + Y_{-1}\Gamma_1 + X\Gamma_2 = U $$

$$ U = U_{-1}R + E $$

$$ E(E_t) = 0 $$

$$ E(E_t'E_t) = \Sigma $$

where $Y$ is a $T \times 3$ matrix of observations on the endogenous
variables, $Y_{-1}$ is a $T \times 3$ matrix of observations on the lagged
endogenous variables, $X$ is a $T \times 7$ matrix of observations on
the exogenous variables whose first element at every
observation is unity, $U$ is the $T \times 3$ observation matrix of the
disturbances, $R$ is a $3 \times 3$ diagonal matrix of the
autocorrelation coefficients and $E$ is a $T \times 3$ matrix of random
errors. $B$, $\Gamma_1$ and $\Gamma_2$ are the structural coefficients. To
make our results comparable with the previous experiments,
we used the same structure of the coefficients of the
endogenous and the exogenous variables specified in Chapter
V. The structure of the coefficients of the lagged
endogenous variables, $\Gamma_1$, is:$^{61}$


61 Given the structure of the coefficients, we can show that
the dynamic system (1) is stable, i.e., the characteristic
roots of $\Gamma_1 B^{-1}$ lie within the unit circle. For detailed
discussion of the stability conditions of such dynamic
systems see Murata (1977).


$\Gamma_1$ = [3 x 3 matrix; entries legible in the source: 0, 0, 0.29, 0, 0, 0, 0]

We used both trended and non-trended stochastic exogenous
variables. First, we examine the relative performance of the
limited information estimators using non-trended stochastic
exogenous variables. These variables follow the same process
defined in Chapter V. Then, we investigate the effect of the
trended data (to be specified later) on the performance of
the estimators. The structure of the variance-covariance
matrix of the errors, $\Sigma$, will be the same as in Chapter V.

As was the case in the previous experiments, we first
undertook a pilot study which covered all the possible
variations of the different estimators. Then we selected the
estimators which performed relatively better than the others
for the rest of the experiments. The pilot experiment
undertaken in this part investigates the performance of the
9 estimators shown in Table 6-1 when the sample size is 30
and the autocorrelation coefficient of the first structural
equation is equal to 0.6, while the second and third
equations possess autocorrelation of 0.9 and 0.2,
respectively. To increase the efficiency of the experiment
without increasing the computational cost we employed the
antithetic variate method. Each complete experiment
consisted of 200 replications, 100 direct and 100
antithetic. For the experiments which use the effective
sample size of 30, we generated 50 observations on $X_t$ and
$U_t$.$^{62}$ Using $X_t$ and $U_t$, the $Y_t$ series were constructed, using
zero for the initial value of $Y_{-1}$. Then the first 20
observations were rejected to make the sample independent of
the starting values. For those experiments which used the
effective sample size of 60, we generated 100 observations
on $X_t$ and $U_t$, and thus on $Y_t$, using the initial value of zero
for $Y_{-1}$. Then the first 40 observations were rejected.
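The data generation described above (zero starting value, rejection of the early observations, and an antithetic replication obtained by reversing the sign of the random errors) can be sketched as follows. The two-equation coefficient matrices are hypothetical and much smaller than the three-equation model of the experiment, and starting the disturbances at zero is an assumption of the sketch.

```python
import numpy as np

def simulate_system(B, G1, G2, R, X, E):
    """Generate Y and U from  Y B + Y_{-1} G1 + X G2 = U,  U = U_{-1} R + E.

    Starting values: Y_{-1} = 0 as in the text, U_{-1} = 0 (an assumption).
    """
    n, G = E.shape
    B_inv = np.linalg.inv(B)
    Y = np.zeros((n, G))
    U = np.zeros((n, G))
    y_prev = np.zeros(G)
    u_prev = np.zeros(G)
    for t in range(n):
        U[t] = u_prev @ R + E[t]
        # Solve the structural form for the current endogenous variables.
        Y[t] = (U[t] - y_prev @ G1 - X[t] @ G2) @ B_inv
        y_prev, u_prev = Y[t], U[t]
    return Y, U

def antithetic_pair(B, G1, G2, R, X, E, burn_in):
    """Direct and antithetic samples: same X, errors E and -E."""
    Y_direct, _ = simulate_system(B, G1, G2, R, X, E)
    Y_anti, _ = simulate_system(B, G1, G2, R, X, -E)
    return Y_direct[burn_in:], Y_anti[burn_in:]

# Hypothetical two-equation example: 50 generated observations, 20 rejected.
rng = np.random.default_rng(0)
B = np.array([[1.0, 0.2], [0.1, 1.0]])
G1 = np.array([[-0.3, 0.0], [0.0, 0.0]])
G2 = -np.ones((1, 2))
R = np.diag([0.6, 0.2])
X = np.ones((50, 1))
E = rng.standard_normal((50, 2))
Y_direct, Y_anti = antithetic_pair(B, G1, G2, R, X, E, burn_in=20)
print(Y_direct.shape, Y_anti.shape)
```

Because the system is linear in the errors, the average of a direct and an antithetic sample equals the path generated with all errors set to zero, which is the source of the variance reduction.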

All estimators considered in this part (except Fair)
have the first step, which is an instrumental variable
estimation of the structural coefficients under
consideration, in common. Therefore, for the first step IV
estimation we use the fitted values of the endogenous
variables estimated from the augmented reduced form as
instrumental variables for the current and the lagged
endogenous variables. For the Theil estimator which employs
the ordinary reduced form, we chose $\hat{Y}_{-1} = X_{-1}\hat{\Pi}$ as the
instrument for $Y_{-1}$.

For  a  finite  sample  size,  instrumental  variable 
estimators  might  lack  moments  of  any  order.  This,  as 
Hatanaka  pointed  out63,  is  due  to  the  fact  that  the 
matrix64  (W'Z)  might  not  be  positive  definite  and  therefore 
the  probability  of  the  determinant  of  (W'Z)  being  equal  to 
zero  can  be  non-zero  which  causes  the  integral  that  defines 

62  Note  that  effective  sample  size  "T"  means  having  "T+1" 
observations  since  the  existence  of  lagged  endogenous 
variables  automatically  reduces  the  number  of  observations 
by  one. 

6 3Hatanaka  (1974),  pp. 205-206. 

6 4Note  that  the  standard  IV  estimator  is  in  the  form  of: 
B=(W'Z)"1W'y. 


B 


139 


the  moments  of  the  IV  estimator  to  become  infinite.  Since 
all  estimators  in  this  part  are  based  on  the  first  stage  IV 
estimation,  there  is,  thus,  a  possibility  that  they  also 
lack  moments  of  any  order.65  Therefore,  to  deal  with  this 
problem,  we  shall  accompany  our  standard  summary  statistics 
with  statistics  on  the  median  as  a  measure  of  centre,  and 
the  interdecile  range,  i.e.,  the  distance  between  the 
quantiles  of  order  0.1  and  0.9,  as  a  measure  of 
dispersion.66
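The two moment-free statistics can be computed as in the following sketch. The helper names and the linear-interpolation rule for the quantiles are assumptions made for illustration; the source does not state which quantile convention was used. The eleven draws are hypothetical, with one extreme value standing in for a heavy-tailed estimator.

```python
import statistics

def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list (0 <= p <= 1)."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight

def robust_summary(estimates, true_value):
    """Median bias and interdecile range of a set of Monte Carlo estimates."""
    ordered = sorted(estimates)
    median_bias = statistics.median(ordered) - true_value
    interdecile_range = quantile(ordered, 0.9) - quantile(ordered, 0.1)
    return median_bias, interdecile_range

# Eleven hypothetical replications; the last draw mimics an estimator
# whose sampling distribution has very heavy tails.
draws = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 100.0]
median_bias, idr = robust_summary(draws, true_value=0.5)
mean_bias = statistics.mean(draws) - 0.5
print(median_bias, idr, mean_bias)
```

A single wild replication moves the mean-based bias far from zero, while the median bias and the interdecile range are unaffected, which is the reason for reporting them alongside the conventional statistics.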

D.  Monte  Carlo  Results  Using  Stochastic  Non-trended 
Exogenous  Variables 

a.  Results  of  the  Pilot  Experiment 

The  pilot  experiment  was  carried  out  using  the 
antithetic  method  to  compare  the  performance  of  the 
estimators  shown  in  Table  6-1.  We  encountered  major  problems 
in  using  three  of  the  estimators,  namely  Theil  which  uses 
the  ordinary  reduced  form,  the  Dhrymes  two-step  estimator 
and  its  iterative  version.  In  many  instances  they  failed  to 
converge  after  20  iterations.  Moreover,  they  often  produced 
unacceptable results (i.e., r > 1) when they converged. Also,

65  Note  that  this  problem  did  not  exist  for  the  models 
without lagged endogenous variables. They were all based on
a  first  step  2SLS  estimator  whose  first  two  moments  are 
known  to  exist  for  the  system  under  consideration  in  the 
first  part  of  this  thesis.  Moreover,  we  calculated  the 
measures  of  centre  and  location  that  are  not  based  on  the 
existence  of  the  moments  for  the  non-lagged  endogenous 
variable  model  and  we  did  not  find  any  change  in  the  ranking 
of  the  estimators. 

66 These are the summary statistics used by Hatanaka (1974),
p. 206.

in a few of the iterations for the Dhrymes two-step estimator
we  obtained  singular  matrices.  Therefore,  we  omitted  those 
estimators  and  carried  out  the  experiment  with  the  remaining 
six  estimators. 


b.  General  Results  of  the  Monte  Carlo  Experiment 

In  addition  to  these  six  estimators,  we  considered  2SLS 
and  the  true  Theil  estimators.  The  2SLS  method  estimates  the 
structural  coefficients  with  the  restriction  that  the 
autocorrelation coefficient is equal to zero. This, in fact,
is a misspecification. Due to the presence of lagged
endogenous variables, this misspecification not only leads
to  inefficient  results,  but  also  gives  inconsistent 
coefficient  estimates.  The  true  Theil  estimator  estimates 
the  structural  coefficients  under  the  assumption  of  known 
autocorrelation  coefficient  and  will  be  used  as  a  base  for 
comparison.  Appendix  V  contains  the  detailed  statistics 
concerning  the  relative  performance  of  the  above  estimators. 
In  what  follows,  however,  we  shall  compare  them  according  to 
their  aggregate  indices. 
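The inconsistency that arises from ignoring autocorrelation in the presence of a lagged endogenous variable can be illustrated in a much simpler setting than ours. The sketch below is a single-equation analogue, not the 2SLS estimator of the simultaneous system: for y_t = beta*y_{t-1} + u_t with u_t = rho*u_{t-1} + e_t, the least squares slope converges to (beta + rho)/(1 + beta*rho) rather than to beta. The parameter values, sample length and function name are assumptions chosen for the example.

```python
import random

def ols_slope_with_ar1_errors(beta, rho, n_obs=20000, burn_in=200, seed=1):
    """Least squares slope of y_t on y_{t-1} when the errors follow an AR(1)."""
    rng = random.Random(seed)
    y_prev, u_prev = 0.0, 0.0
    sum_xy, sum_xx = 0.0, 0.0
    for t in range(n_obs + burn_in):
        u = rho * u_prev + rng.gauss(0.0, 1.0)
        y = beta * y_prev + u
        if t >= burn_in:                 # skip start-up observations
            sum_xy += y_prev * y
            sum_xx += y_prev * y_prev
        y_prev, u_prev = y, u
    return sum_xy / sum_xx

beta, rho = 0.5, 0.6
estimate = ols_slope_with_ar1_errors(beta, rho)
probability_limit = (beta + rho) / (1.0 + beta * rho)
print(round(probability_limit, 3))
```

Even with a very long series the estimate stays near the probability limit and well away from the true value of 0.5, so a larger sample does not remove the bias.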

1.  Comparisons  using  measures  of  centre 

Table  6-2  presents  the  results  of  the  antithetic 
estimates  of  the  aggregate  biases  of  different  estimators 
for  different  sample  sizes.  For  low  and  medium  degrees  of 
autocorrelation,  2SLS  has  the  lowest  bias.  Its  bias 


TABLE 6-2: Aggregate Bias Calculated on the Basis of the Mean*
*Note: this index does not include the intercept.

increased sharply as the autocorrelation coefficient reached
0.9.  Theil,  Hatanaka(A)  and  Dhrymes  estimators  have  more  or 
less  the  same  aggregate  bias.  As  was  the  case  in  the 
non-lagged  system,  the  Fair  estimator  has  the  highest 
aggregate  bias.  As  we  discussed  before,  the  index  of  the 
aggregate  bias  does  not  include  the  constant  term.  When  the 
autocorrelation  coefficient  reached  0.9,  all  estimators 
seriously  underestimated  the  constant  term.  This 
underestimation  can  cause  serious  problems  for  prediction 
purposes.  The  aggregate  bias  of  Theil,  Hatanaka(A)  and 
Dhrymes  estimators  decreased  considerably  when  the  sample 
size  was  increased  to  60,  revealing  their  consistency 
property.  The  aggregate  bias  of  2SLS  increased  considerably 
when  the  autocorrelation  coefficient  was  equal  to  0.9  and 
the  sample  size  was  increased  to  60.  However,  this  behaviour 
which  is  in  accordance  with  its  inconsistency  property  only 
shows  up  when  the  autocorrelation  coefficient  is  very  high. 
For  low  and  medium  degrees  of  autocorrelation  2SLS  had  the 
lowest  bias.  Generally,  the  Theil,  Hatanaka(A)  and  Dhrymes 
estimators  performed  best  when  the  sample  size  increased  and 
the  autocorrelation  coefficient  was  high.  The  above  picture 
remains more or less the same when we use the median rather
than the mean as a measure of location (Table 6-3). This
correspondence of median and mean suggests that the
distributions of these estimators are approximately symmetric.


2. Comparison Using Normalized Determinant(MSE) Criterion

A more revealing picture of the performance of
different  estimators  can  be  observed  by  examining  Table  6-4 
in  which  we  have  normalized  the  det(MSE)  of  the  estimators 
by  dividing  it  by  the  corresponding  det(MSE)  of  the  true 
Theil  estimator.  Using  the  normalized  det(MSE)  as  an 
aggregate  measure  of  relative  efficiency  of  different 
estimators,  we  observed  that  for  sample  size  of  30  the 
Hatanaka(A)  estimator  performed  best  for  low  degrees  of 
autocorrelation.  Its  performance  deteriorated  marginally 
when  the  autocorrelation  coefficient  was  increased  to  0.6 
and  became  inferior  to  the  Dhrymes  and  Theil  estimators  for 
high degrees of autocorrelation. Dhrymes' estimator ranks
best  for  medium  and  high  degrees  of  autocorrelation. 
Generally,  three  of  the  estimators,  Dhrymes,  Theil  and 
Hatanaka(A)  performed  considerably  better  than  all  other 
estimators  for  all  degrees  of  autocorrelation  when  the 
sample  size  was  equal  to  30.  The  performance  of  2SLS 
worsened  as  the  autocorrelation  coefficient  increased.  The 
Fair  estimator  performed  poorly  relative  to  others.  This 
picture  remained  more  or  less  the  same  when  the  sample  size 
was  increased  to  60.  In  this  case,  the  relative  position  of 
the Theil and Hatanaka(A) estimators changed slightly; the
Theil estimator performed best for low degrees of
autocorrelation.  Dhrymes'  estimator  performed  best  for 
medium  and  high  degrees  of  autocorrelation.  The  Hatanaka(A) 
estimator  performed  poorly  relative  to  Dhrymes  and  Theil 
when  the  autocorrelation  coefficient  reached  0.9. 
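The normalized det(MSE) and trace(MSE) criteria can be computed from the replications as in the following sketch. The function names and the tiny two-coefficient data set are hypothetical; in our tables the benchmark is the true Theil estimator and the intercept is excluded, whereas here any benchmark set of estimates can be passed in.

```python
def mse_matrix(estimates, true_values):
    """Average outer product of the estimation errors over replications."""
    k = len(true_values)
    matrix = [[0.0] * k for _ in range(k)]
    for draw in estimates:
        errors = [draw[i] - true_values[i] for i in range(k)]
        for i in range(k):
            for j in range(k):
                matrix[i][j] += errors[i] * errors[j] / len(estimates)
    return matrix

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                    # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
    return det

def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))

def normalized_criteria(estimates, benchmark_estimates, true_values):
    """Ratios of det(MSE) and trace(MSE) to those of a benchmark estimator."""
    m = mse_matrix(estimates, true_values)
    b = mse_matrix(benchmark_estimates, true_values)
    return determinant(m) / determinant(b), trace(m) / trace(b)

# Hypothetical replications for two slope coefficients with true values (1, 2).
truth = [1.0, 2.0]
benchmark = [[2.0, 2.0], [1.0, 3.0], [0.0, 2.0], [1.0, 1.0]]
candidate = [[3.0, 2.0], [1.0, 3.0], [-1.0, 2.0], [1.0, 1.0]]
det_ratio, trace_ratio = normalized_criteria(candidate, benchmark, truth)
print(det_ratio, trace_ratio)
```

Values above one indicate that the candidate is less efficient than the benchmark. The two ratios need not agree in ranking, since the determinant also reflects the covariances between coefficient errors while the trace does not.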


TABLE 6-4: Normalized Determinant(MSE)*
*Note: this index does not include the intercept.

3. Comparison Using Trace(MSE) as a Criterion

A  different  picture  appears  when  we  compare  estimators 
on  the  basis  of  the  normalized  trace(MSE)  (Table  6-5).  For 
sample  size  30,  2SLS  dominated  all  other  estimators  for  low 
and  medium  degrees  of  autocorrelation.  The  Hatanaka(C) 
estimator  also  performed  relatively  well  for  low  degrees  of 
autocorrelation.  The  relative  performance  of  2SLS  and 
Hatanaka(C)  deteriorated  as  the  autocorrelation  coefficient 
reached 0.9. For a high degree of autocorrelation, the Theil,
Dhrymes and Hatanaka(A) estimators performed best. The
relative ranking of the estimators changed slightly when the
sample size increased to 60: 2SLS ranked best only for a low
degree of autocorrelation, while the Dhrymes, Theil and
Hatanaka(A) estimators performed best for medium and high
degrees of autocorrelation.

As  we  mentioned  before,  comparison  of  estimators  on  the 
basis  of  the  normalized  det(MSE)  and  trace(MSE)  implicitly 
assumes  the  existence  of  the  small  sample  moments.  However, 
the  estimators  that  are  based  on  the  first  step  instrumental 
variable  estimation  might  not  possess  small  sample  moments. 
All the estimators considered in this part, except 2SLS, are
based on a first-step instrumental variable estimation.
This  might  explain  the  relative  superiority  of  2SLS  for  low 
and  medium  degrees  of  autocorrelation  when  the  sample  size 
was  equal  to  30  and  comparison  was  made  on  the  basis  of 
normalized  trace(MSE). 


TABLE 6-5: Normalized Trace(MSE)*
*Note: this index does not include the intercept.

4. Comparison Using Interdecile Range as a Criterion

Table  6-6  shows  the  normalized  trace  of  the  interdecile 
range  of  different  estimators  for  sample  sizes  of  30  and  60. 
Analogously  to  the  normalized  trace(MSE),  we  have  normalized 
the  sum  of  the  interdecile  range  across  coefficients  for 
each  estimator.  To  make  the  results  comparable  to  the 
trace(MSE),  we  have  omitted  the  intercept  term  from  the 
calculation  of  this  index.  Using  this  index  as  a  criterion 
of  evaluation  we  can  observe  that  for  a  low  degree  of 
autocorrelation  Hatanaka(A)  performed  marginally  best  while 
for  medium  and  high  degrees  of  autocorrelation  the  Theil  and 
Dhrymes  estimators  outperformed  all  others.  The  performance 
of  the  Hatanaka(A)  estimator  deteriorated  marginally  as  the 
autocorrelation  coefficient  increased.  The  2SLS  estimator 
did  not  perform  well  even  for  low  degrees  of 
autocorrelation.  Its  relative  performance  deteriorated  as 
the  autocorrelation  coefficient  increased.  This  behavior  is 
in  accordance  with  its  inconsistency  and  inefficiency 
properties.  Hatanaka(C)  performed  relatively  well  and  ranked 
fourth  for  medium  and  high  degrees  of  autocorrelation  when 
the  sample  size  was  30.  The  above  picture  did  not  change 
when  the  sample  size  was  increased  to  60. 

It is noteworthy that the above ranking of the
estimators  is  more  or  less  the  same  as  the  one  that  emerged 
using  the  normalized  det(MSE). 

To  summarize  this  section,  the  ranking  of  estimators 
depends  on  the  criteria  that  one  uses  to  measure  their 


TABLE 6-6: Normalized Trace(Interdecile Range)*

relative efficiency. The ranking made using the interdecile
range  is  theoretically  more  sound  since  employment  of  the 
measures  that  are  based  on  the  existence  of  small  sample 
moments  can  be  misleading  in  the  present  case.  Therefore,  we 
can  conclude  that  the  Theil,  Dhrymes  and  Hatanaka(A) 
estimators  performed  best  among  all  the  potential  estimators 
for  the  dynamic  simultaneous  equation  models  with 
autocorrelated errors. There exists a significant gain in
efficiency from using these methods, which take account of
autocorrelation  and  the  lagged  endogenous  variables,  even 
for  very  low  degrees  of  autocorrelation.  2SLS  performed 
uniformly  poorly  relative  to  these  estimators  even  when  the 
autocorrelation  coefficient  was  equal  to  0.2. 

An important observation is that the Fair estimator,
which is the most commonly known method for estimation of
dynamic autoregressive models and is incorporated in the TSP
package, was inferior to the Theil, Hatanaka(A) and Dhrymes
estimators. The above results therefore suggest a potential
danger in using the Fair estimator for estimation of
dynamic autoregressive models. Note also that Hatanaka(B),
which is based on a Brundy and Jorgenson method, generally
does poorly, which is in agreement with the results in the
static model.


E. Monte Carlo Results Using Stochastic Trended Exogenous
Variables

To  examine  the  effect  of  trended  data  on  the 
performance  of  dynamic  autoregressive  estimators  we  used  the 
following exogenous variables:

$X_1 = e^{0.15t} + w_1, \quad w_1 \sim N(0, 0.0049)$

$X_2 = e^{0.12t} + w_2, \quad w_2 \sim N(0, 0.0025)$

$X_3 = e^{0.10t} + w_3, \quad w_3 \sim N(0, 0.0009)$

$X_4$, $X_5$ and $X_6$ are defined to be stochastic non-trended
as in Chapter 5.
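A minimal sketch of how trended exogenous series of this exponential-plus-noise form could be generated is given below. The growth rates (0.15, 0.12, 0.10) and noise variances (0.0049, 0.0025, 0.0009) are our reading of the partly illegible source equations and should be treated as assumptions, as should the function name and the choice of a time index starting at 1.

```python
import math
import random

def trended_exogenous(n_obs, growth_rate, noise_variance, rng):
    """X_t = exp(growth_rate * t) + w_t, with w_t ~ N(0, noise_variance)."""
    noise_sd = math.sqrt(noise_variance)
    return [math.exp(growth_rate * t) + rng.gauss(0.0, noise_sd)
            for t in range(1, n_obs + 1)]

rng = random.Random(0)
# (growth rate, noise variance) pairs; assumed values, see the lead-in.
designs = [(0.15, 0.0049), (0.12, 0.0025), (0.10, 0.0009)]
x1, x2, x3 = (trended_exogenous(30, g, v, rng) for g, v in designs)
print(len(x1), len(x2), len(x3))
```

Because the noise variance is small relative to the exponential component, each series is dominated by its trend after the first few periods.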

To  minimize  the  cost,  we  only  conducted  experiments 
using  a  sample  size  of  30  and  autocorrelation  coefficient  of 
0.6.  We  omitted  the  Hatanaka(B)  and  Hatanaka(C)  estimators, 
since they performed poorly in the previous case. We retained
the Fair estimator in order to make our results comparable to
the ones obtained in the first part of this study.

1. Comparison  Using  Measures  of  Centre 

Table  6-7  shows  the  results  of  the  antithetic  estimates 
of  the  biases  of  different  estimators  for  the  trended  data. 
As can be seen, 2SLS has the lowest aggregate bias,
followed by the Dhrymes and Theil estimators. In estimation
of the autocorrelation coefficient, the Theil estimator has
the lowest bias, followed by the Dhrymes estimator. The Fair
estimator has the highest aggregate bias. In estimation of
the constant term, which is not included in the index of


[TABLE 6-7: bias of the estimators using trended data; the table body is not recoverable from the source]


aggregate  bias,  the  Fair  estimator  performed  worst.  It 
seriously underestimated the intercept term. The rest of the
estimators  had  more  or  less  the  same  absolute  bias  in  the 
estimation  of  the  intercept  term. 

The  above  picture  does  not  change  when  we  consider  the 
median  rather  than  the  mean  as  a  measure  of  location  (Table 
6-8).  The  biases  of  the  estimators  stayed  more  or  less 
unchanged. This correspondence of the mean and median
suggests that the distributions of these estimators are
approximately symmetric.

The  above  pattern  of  the  aggregate  bias  of  the 
estimators  using  trended  data  is  more  or  less  the  same  as 
the  one  that  emerged  using  non-trended  data  (Tables  6-2, 
6-3).  In  other  words,  the  existence  of  trend  in  the 
exogenous  variables  did  not  significantly  change  the 
relative  performance  of  the  estimators. 

2. Comparison Using Normalized Determinant(MSE) Criterion

Table  6-9  presents  the  normalized  MSE  and  det(MSE)  of 
different  estimators  for  trended  data.  Ranking  the 
estimators  on  the  basis  of  their  normalized  det(MSE),  we  can 
observe  that  Hatanaka's  Residual  Adjusted  estimator 
performed  best.  The  Dhrymes  and  Theil  estimators  ranked 
second  and  third.  The  2SLS  estimator  was  outperformed  by  the 
above  three  estimators.  The  Fair  estimator  performed  poorly. 
Its  performance  in  this  case  is  in  contrast  to  its 
performance  in  the  static  model.  In  other  words,  Fair's 
performance  in  the  model  with  lagged  endogenous  variables, 


TABLE 6-8: Bias Calculated

TABLE 6-9: Normalized Root Mean Square Error Using Trended Data

(r = 0.6, T = 30)

for which it was specifically designed, was not as strong as
in  the  static  case  when  the  variables  were  trended. 

The  ranking  of  the  Hatanaka  and  Theil  estimators  has 
changed  slightly  due  to  the  existence  of  trend  in  the 
exogenous  variables.  However,  the  general  picture  is  more  or 
less  the  same  as  the  one  obtained  using  stochastic 
non-trended exogenous variables.

3.  Comparison  Using  Trace(MSE)  as  a  Criterion 

The last row of Table 6-9 gives the normalized
trace(MSE) for different estimators. Using this index as a
criterion of evaluation, we can see that the relative ranking
of the Hatanaka and Dhrymes estimators changed compared to
the one obtained using GMSE. However, the general pattern of
relative performance stayed unchanged. The Dhrymes,
Hatanaka and Theil estimators outperformed all others.

In the case of stochastic non-trended exogenous
variables (Table 6-5) we observed that the ranking of the
estimators  changed  when  we  used  trace(MSE)  instead  of  the 
normalized  det(MSE).  In  that  case  the  2SLS  performed  best  on 
the  basis  of  the  normalized  trace(MSE).  However,  in  the 
present  case  of  stochastic  trended  data,  the  relative 
ranking  of  the  estimators  did  not  change.  In  other  words, 
the performance of the 2SLS estimator was not affected by
the measure of goodness used. This might suggest that the
existence of trend in the exogenous variables has a negative
effect on the performance of 2SLS, judged on the basis
of the normalized trace(MSE).

4. Comparison Using Interdecile Range as a Criterion

Table  6-10  shows  the  normalized  interdecile  range  of 
different  estimators  using  trended  data.  The  last  row  of 
this  Table  shows  the  normalized  trace  of  the  interdecile 
ranges.  As  we  mentioned  before,  this  index  does  not  include 
the  intercept  term. 

Comparing estimators on the basis of the trace of their
interdecile ranges gives the same ranking as the one obtained
using trace(MSE). This relative ranking of the estimators
using trended exogenous variables is the same as the one that
emerged using stochastic non-trended variables.

To  summarize,  the  ranking  of  the  estimators  did  not 
significantly  change  due  to  the  existence  of  trend  in  the 
exogenous  variables.  This  observation  is  in  line  with  the 
results  obtained  in  the  first  part  using  models  without 
lagged  endogenous  variables. 

F.  Performance  of  Estimators  in  Hypothesis  Testing 

Detailed  statistics  concerning  the  Type  I  error  and  the 
power  of  the  tests  are  given  in  Appendix  VII.  In  this 
section  we  present  the  summary  indices  used  in  Chapter  V. 
Tables 6-11 and 6-12 show the average Type I error and the
average power of the test of significance of different
estimators  for  different  sample  sizes  and  autocorrelation 
coefficients  when  the  exogenous  variables  are  stochastic 


TABLE 6-10: Normalized Interdecile Range Using Trended Data

(r = 0.6, T = 30)

non-trended.67 The figures in parentheses show the range of
the  corresponding  statistics  over  the  structural 
coefficients.  These  tables  show  that  for  the  sample  size  of 
30  the  observed  level  of  Type  I  error  for  all  estimators  is 
much  greater  than  the  nominal  level  of  significance  for  all 
degrees of autocorrelation. This problem was especially
clear  in  the  case  of  Theil,  Hatanaka(A)  and  Dhrymes' 
estimators  for  low  and  moderate  autocorrelation 
coefficients.  When  the  sample  size  was  increased  to  60  the 
observed  Type  I  error  of  most  of  the  estimators  decreased. 
However,  it  was  still  much  larger  than  the  nominal  level  of 
significance.  Table  6-12  shows  the  power  of  the  test  of 
significance  for  different  estimators.  All  estimators 
performed  poorly.  These  results  resemble  the  ones  obtained 
by Park and Mitchell (1981) in the single equation context.
They  attributed  the  high  percentage  of  Type  I  errors  to  the 
underestimation  of  the  standard  errors.  The  above  picture 
remained  more  or  less  unchanged  when  we  used  stochastic 
trended  exogenous  variables  (Tables  6-13,  6-14). 
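The link between understated standard errors and an inflated Type I error can be illustrated with a small simulation. This is a stylized sketch, not a reproduction of our experiments: the estimates are drawn directly from a normal distribution centred on a true coefficient of zero, the reported standard error is assumed to be half the true one, and the normal critical value 1.96 is used in place of the t-distribution value.

```python
import random

def rejection_rate(t_statistics, critical_value=1.96):
    """Share of replications in which |t| exceeds the critical value."""
    hits = sum(1 for t in t_statistics if abs(t) > critical_value)
    return hits / len(t_statistics)

rng = random.Random(42)
# The null hypothesis is true: every estimate scatters around zero.
estimates = [rng.gauss(0.0, 1.0) for _ in range(5000)]
true_se = 1.0
understated_se = 0.5   # hypothetical: reported standard error is half the true one

nominal = rejection_rate([b / true_se for b in estimates])
inflated = rejection_rate([b / understated_se for b in estimates])
print(nominal < inflated)
```

With the correct standard error the rejection frequency stays near the nominal 5 percent level, while halving the standard error raises it several times over, which is the pattern attributed above to underestimated standard errors.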

The  above  results  lead  us  to  the  same  conclusion 
reached by Park and Mitchell (1981, p. 199): "distrust
the conventional t-statistics.... Because estimated
coefficients  seem  much  more  significant  than  they  really 
are,  apply  a  more  stringent  confidence  level  for  hypothesis 
testing." 

67  It  must  be  noted  that  the  Hatanaka(B)  estimator  was 
excluded from this section since, in a number of instances, it
gave  rise  to  negative  variances  that  were  not  acceptable. 


TABLE 6-13: Percentage Type

These results suggest that hypothesis testing can
be  a  serious  problem  for  the  dynamic  autoregressive  models. 
In  Chapter  V  we  saw  that  some  of  the  estimators  performed 
well  in  hypothesis  testing  when  the  model  did  not  include 
lagged endogenous variables. However, the results of this
section show that none of the estimators is reliable for
hypothesis testing in the dynamic autoregressive models.

VII. SUMMARY AND CONCLUSION


Time series data often generate disturbances which are
time dependent. The simplest form of this time dependence is
first order autocorrelation. Therefore, it is of great
interest  for  the  econometricians  working  with  time  series 
data  to  inquire  into  the  properties  of  methods  appropriate 
for  the  estimation  of  models  characterized  by 
autocorrelation. 

Numerous techniques have been proposed for estimation
of first order autocorrelated models in simultaneous
equation systems. These methods can be distinguished by
whether they use T or T-1 observations, the reduced form
they employ and the way they estimate the autocorrelation
coefficient. Most of the proposed methods are asymptotically
efficient, but nothing is known about their small sample
properties. This thesis was an attempt to fill this gap using
Monte Carlo methods.

We chose the three equation model used by
Cragg (1967, 1968) for our experiments. The choice of a three
equation model was based on the findings of Mosbaek (1970),
which suggest that three equation models reveal the
essential properties of larger models whereas two equation
models do not. To increase the efficiency of our experiments
we chose the antithetic variate method, which is more
efficient than the conventional direct simulation method.
Our pilot experiment showed that on average we achieved a
gain in efficiency of about 20 percent. Each complete
experiment consisted of 200 replications, 100 direct and 100
antithetic.  We  concentrated  on  the  limited  information 
methods  for  reasons  of  cost  and  relevance. 
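The idea behind the antithetic variate method can be shown with a toy example. The sketch below is not our experimental design: it estimates the mean of a simple monotone function of a standard normal draw, once from pairs of independent draws and once from antithetic pairs (z, -z). The function `target`, the seed and the number of pairs are assumptions made for illustration, and the size of the gain here is not comparable to the figure reported above.

```python
import math
import random
import statistics

def target(z):
    """A monotone function of the disturbance; purely illustrative."""
    return math.exp(0.5 * z)

def direct_pair_means(n_pairs, rng):
    """Average of two independent draws per pair."""
    return [(target(rng.gauss(0.0, 1.0)) + target(rng.gauss(0.0, 1.0))) / 2.0
            for _ in range(n_pairs)]

def antithetic_pair_means(n_pairs, rng):
    """Average of a draw and its mirror image per pair."""
    means = []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        means.append((target(z) + target(-z)) / 2.0)
    return means

rng = random.Random(7)
direct = direct_pair_means(2000, rng)
antithetic = antithetic_pair_means(2000, rng)
variance_ratio = statistics.variance(antithetic) / statistics.variance(direct)
print(variance_ratio < 1.0)
```

Because the two members of an antithetic pair are negatively correlated, their average varies less than the average of two independent draws, so the same number of replications yields a more precise estimate.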

Our  investigation  consisted  of  two  major  parts.  The 
first  part  examined  the  small  sample  properties  of  the 
estimators designed for estimation of autocorrelated
models  without  lagged  endogenous  variables.  The  second  part 
studied  the  properties  of  estimators  appropriate  for 
estimation  of  dynamic  autoregressive  models. 

Findings  from  the  Static  Model 

In  the  first  part,  we  considered  almost  all  the  limited 
information  methods  proposed  in  the  literature.  We  also 
suggested  new  estimators  and  different  ways  of  interpreting 
or  deriving  the  existing  methods.  To  minimize  the 
probability  of  exclusion  of  any  potentially  good  estimator, 
we  first  conducted  a  pilot  experiment  and  then  chose  the 
estimators  that  performed  relatively  well  for  the  subsequent 
experiments.  We  conducted  experiments  with  both  trended  and 
non-trended  data. 

Recently Maeshiro (1976, 1979, 1980) and Park and
Mitchell (1980) compared the relative efficiency of single
equation estimators that use all the observations with those
that omit the first observation. They found that the
estimators which omit the first observation are inferior to
the ones which use all observations, especially when the
data were trended. Following them, we compared the relative
efficiency of the limited information estimators that use T
or T-1 observations. We found that employment of an extra
observation  had  some  positive  effect  on  the  small  sample 
performance  of  estimators,  especially  when  the  exogenous 
variables  were  trended.  However,  the  difference  was  not  as 
great as was suggested by Maeshiro (1976, 1979) and Park and
Mitchell (1980). To investigate the apparent discrepancy in
the  results,  we  undertook  a  Monte  Carlo  experiment 
replicating  and  extending  their  investigation.  We  found  that 
the  source  of  the  discrepancy  was  the  restricted  nature  of 
the  model  employed  by  them. 
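The difference between using T and T-1 observations comes down to how the first observation is treated in the autoregressive transformation. The following sketch shows the standard transformation for a single series with a known autocorrelation coefficient: the T-observation form rescales the first observation by sqrt(1 - rho^2) in the manner of Prais and Winsten, while the T-1 form drops it in the manner of Cochrane and Orcutt. The function name and the example series are hypothetical.

```python
import math

def quasi_difference(series, rho, keep_first=True):
    """Apply the first order autoregressive transformation to a series.

    keep_first=True  : T observations; the first one is rescaled by
                       sqrt(1 - rho**2) (Prais-Winsten style).
    keep_first=False : T-1 observations; the first one is dropped
                       (Cochrane-Orcutt style).
    """
    transformed = [series[t] - rho * series[t - 1]
                   for t in range(1, len(series))]
    if keep_first:
        transformed.insert(0, math.sqrt(1.0 - rho ** 2) * series[0])
    return transformed

y = [1.0, 2.0, 3.0, 4.0]
pw = quasi_difference(y, 0.6, keep_first=True)    # T observations
co = quasi_difference(y, 0.6, keep_first=False)   # T-1 observations
print(len(pw), len(co))
```

The two transformed series agree except for the retained first observation, so any difference in small sample performance is due to the information carried by that single data point.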

Comparison of the methods employing different reduced
forms showed that the augmented reduced form produced more
efficient estimates. We also found that the Prais-Winsten
method for estimation of the autocorrelation coefficient
always leads to an estimate that has lower bias and RMSE.
It should be noted that the Theil estimator had the lowest
bias in estimation of the autocorrelation coefficient, even
when it was using the CORC formula, i.e., in the dynamic
autoregressive case.

A more important finding of this study concerns the
relative performance of different methods. In practice, the
Fair  estimator  is  the  most  commonly  known  method  for 
estimation  of  the  models  characterized  by  autocorrelation. 
Our  results  showed  that  (except  for  some  cases  when  the 
exogenous  variables  were  trended)  this  estimator  was  highly 
inefficient relative to the alternatives we considered. In some
instances, it was even inferior to 2SLS, which ignores
autocorrelation.  We  also  found  that  the  small  sample  bias  of 
2SLS  estimator  was  positively  related  to  the  degree  of 
autocorrelation. 

The ranking of different estimators turned out to be
slightly dependent on the criterion of evaluation used. For a
low autocorrelation coefficient of 0.2, 2SLS outperformed
all other estimators on the basis of generalized MSE, while
it was not the best when trace(MSE) was used as a measure of
goodness. The Fair and modified Brundy and Jorgenson
estimators, along with the Amemiya and generalized limited
information maximum likelihood estimators, performed poorly.
Generally, the Theil estimator, which has been completely
ignored in empirical research, turned out to be the best
method for autocorrelated models without lagged endogenous
variables.

At this point we should note that in large models
we may not be able to use Theil's estimator due to the
degrees of freedom problem. In fact, the Fair estimator is
specifically designed to solve this problem by using fewer
instruments than estimators such as Theil's. However, its weak
performance casts some doubt on its usefulness. As an
alternative one may use some variation of Theil's estimator
which uses, for instance, a subset of the principal
components of the exogenous variables.

Following Park and Mitchell (1981), we examined the
performance of different estimators in hypothesis testing.
As they did, we focused on the number of times
each method leads to the occurrence of a Type I error at
the 0.05 level of significance. We found that perhaps the main
problem with employment of the 2SLS and Fair estimators, in
the case of static models and a sample size of 30, is that
they lead to a much greater Type I error than the nominal
level of significance suggests when the autocorrelation
coefficient is moderate or high. This fact suggests that the
tails of their test statistic distributions are much thicker
than those of the t-distribution. The Theil estimator
outperformed all other estimators for moderate and high
autocorrelation coefficients. As the sample size increased
to 60, all estimators experienced a significant reduction in
their actual Type I errors. In fact, their observed Type I
errors became very close to the nominal level of
significance.

Generally,  our  results  lead  us  to  the  conclusion  that 
for  very  low  autocorrelation  coefficients,  2SLS  is  as  good 
as any other estimator in static models with autocorrelated
disturbances.  However,  as  the  autocorrelation  coefficient 
increases  the  performance  of  2SLS  deteriorates.  The  Theil 
estimator  and  its  modified  version  perform  best  for  moderate 
and  high  autocorrelation  coefficients  when  the  model  does 
not  include  lagged  endogenous  variables. 

Findings  from  the  Dynamic  Model 

The  second  part  of  this  thesis  dealt  with  dynamic 
autoregressive models. Very little is known about the small
sample performance of the methods appropriate for estimation
of  these  kinds  of  models.  We  considered  almost  all  the 
existing  methods.  We  also  extended  the  Theil  estimator  to 
take  into  account  lagged  endogenous  variables.  Apart  from 
the  conventional  criteria  of  goodness,  we  also  provided 
information  on  the  median  and  interdecile  range  for  all 
estimators. 
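The two additional summary measures can be computed from the replications of an estimator as in the following minimal sketch. The function name is hypothetical, and the interdecile range is taken here as the distance between the ninth and first deciles, which is an assumption about the exact definition used.

```python
import statistics

def median_and_interdecile_range(estimates):
    """Median and (9th decile - 1st decile) of a list of Monte Carlo estimates."""
    deciles = statistics.quantiles(estimates, n=10)  # nine cut points
    return statistics.median(estimates), deciles[-1] - deciles[0]
```

Unlike the mean squared error, both measures exist even when the sampling distribution of an estimator has no finite moments.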

As was the case in the static models, the ranking of
the estimators was sensitive to the criteria of evaluation
used. Focusing on the interdecile range for theoretical
reasons, we found that Theil's, Dhrymes' and Hatanaka's
Residual Adjusted estimators performed best. The performance
of Hatanaka's Residual Adjusted estimator deteriorated
slightly as the autocorrelation coefficient increased. Two
stage least squares did not perform well even for low
degrees of autocorrelation, and its performance
deteriorated further as the autocorrelation coefficient increased.
This behavior is consistent with its asymptotic properties.
The Fair estimator, along with the other two Hatanaka estimators, did not
perform well. Our investigation showed that the Brundy and
Jorgenson type estimators performed relatively poorly. In
fact, we found that none of the estimators that calculated
the fitted values of the endogenous variables using
consistent estimates of the reduced form coefficients
performed well.

The  effect  of  trended  data  on  the  performance  of 
dynamic  autoregressive  estimators  was  examined  using 
stochastic trended exogenous variables. We found that in
general the ranking of the estimators did not change and
stayed the same as the one that emerged using stochastic
non-trended data.

We saw in the first part that the Fair estimator is
asymptotically less efficient than the Theil estimator. One
of the reasons for its use is the argument that using fewer
instruments at the initial stage might improve its small
sample performance. However, our results showed the
contrary. We also found that 2SLS, which ignores
autocorrelation, generally performed poorly relative to other
estimators. These results endorse Hendry's suggestion that
the asymptotic properties can be used as a guide to the small
sample performance of different estimators. However, it
should also be noted that ALIML, which was shown to be
asymptotically more efficient than Theil's estimator
when the autocorrelation coefficients across equations were
not equal, did not perform well in small samples. This fact,
in turn, supports Maasoumi and Phillips's (1980) argument
that  the  discrepancies  between  asymptotic  and  finite  sample 
behavior  of  the  estimators  are  parameter  dependent  and  we 
cannot,  without  qualification,  postulate  the  asymptotic 
behavior  of  the  estimators  from  the  small  sample  results. 

The second part also examined the performance of
different estimators in tests of hypotheses. We found that
for the dynamic autoregressive model, none of the estimators
performed well. In fact, all of them showed a much greater Type
I error than the nominal level of significance. This result
is in line with Park and Mitchell's result in the single
equation context and therefore leads us to the same type of
conclusion, that we cannot trust the conventional
t-statistics in dynamic autoregressive models. Moreover,
since the estimated coefficients seem to be much more
significant than they really are, we should apply a much
more stringent significance level for hypothesis testing.

To summarize, our study supports the Rao and
Griliches (1969) findings in the single equation context that
"there is a significant gain in efficiency to be had from
using two stage estimation procedures for moderate and high
level of serial correlation in the residuals (r>0.3) and
very little loss using such methods even when the true r is
small". We also found that there is some gain in efficiency
for those estimators that use all T rather than T-1
observations.

The results of this study can clearly be of value in
guiding econometric practitioners in their choice of
technique for estimation of models characterized by
autocorrelation. However, our results are subject to the
limitations inherent in all Monte Carlo studies. One of the
ways that this study can be extended is the incorporation of
Sargan's 2SLS estimator, which is asymptotically equivalent to
ALIML. The main reason for the omission of this estimator
was that the results on the efficiency of ALIML relative to
Theil's estimator became known to us when all the
experiments were done. Moreover, we showed that ALIML, GLIML
and  Theil's  G2SLS  estimators  are  equally  efficient  when  the 
autocorrelation  coefficients  are  equal  across  equations. 
Therefore,  it  would  be  interesting  to  compare  these 
estimators  in  an  experiment  in  which  the  autocorrelation 
coefficients  are  equal.  It  is  conceivable,  however,  that  the 
Theil  estimator  which  outperformed  other  estimators  when  the 
autocorrelation  coefficients  were  different  across  equations 
would  continue  to  dominate  when  the  autocorrelation 
coefficients  are  equal.  This  is  so  since  the  asymptotic 
superiority  of  ALIML  was  obtained  when  the  autocorrelation 
coefficients  were  not  equal  across  equations.  Finally,  a 
clear  direction  worthy  of  further  research  can  be  the 
extension  of  our  investigation  to  the  full  information 
estimators  designed  for  estimation  of  simultaneous  equation 
models with autocorrelated errors.

Appendix I: DERIVATION OF THE LIML ESTIMATOR IN THE PRESENCE
OF FIRST ORDER AUTOCORRELATION

In this appendix we follow closely the approach used by
Koopmans and Hood (1953, pp. 192-95). We also use their
notation to make our derivations easily comparable to
theirs.68

A.  Sargan  and  Amemiya  Version 

Consider the following simultaneous equation system

$$ B Y + \Gamma X = U \tag{1} $$

$$ U = R U_{-1} + E $$

$$ E(\varepsilon_t) = 0, \qquad E(\varepsilon_t \varepsilon_t') = \Sigma $$

Following the practice in Amemiya (1961), Fair (1972), and
Dhrymes (1972), we assume that $R$ is a diagonal matrix of the
form

$$ R = \mathrm{diag}(r_1, r_2, \ldots, r_G) \tag{2} $$

(1) can be transformed to an equation with well behaved
disturbances such as

$$ B Y + \Gamma X - R B Y_{-1} - R \Gamma X_{-1} = E \tag{3} $$

or

$$ A Z = E \tag{4} $$

where

$$ A = (B, \; \Gamma, \; -RB, \; -R\Gamma), \qquad Z' = (Y', \; X', \; Y_{-1}', \; X_{-1}') $$

68 Note that this notation is only adopted in this appendix.

Under normality, the log-likelihood function of all the
parameters of (4) is

$$ L = K + T \log|B| - \frac{T}{2} \log|\Sigma| - \frac{1}{2} \mathrm{tr}(\Sigma^{-1} A M A') \tag{5} $$

where $M = Z Z'$ and $\Sigma$ is the variance-covariance matrix of $\varepsilon_t$.

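Since estimation of the first equation of (4) is the
focus of our attention, we partition $A$ as

$$ A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} = \begin{pmatrix} B_1 & \Psi_1 \\ B_2 & \Psi_2 \end{pmatrix} \tag{6} $$

where $A_1$ is the first row of $A$, $B_1$ is the first row of $B$, $\Psi_1$
is the first row of $(\Gamma, -RB, -R\Gamma)$, and $A_2$, $B_2$, and $\Psi_2$ are
defined accordingly.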

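The corresponding partition of $\Sigma$ is

$$ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \tag{7} $$

where $\Sigma_{11}$ is $1 \times 1$, $\Sigma_{12}$ is $1 \times (G-1)$, and $\Sigma_{22}$ is $(G-1) \times (G-1)$.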

$A_1$ and $\Sigma_{11}$ are parameters to be estimated and hence
have to be retained, while $A_2$, $\Sigma_{12} = \Sigma_{21}'$, and $\Sigma_{22}$ are
parameters to be eliminated by partial maximization of (5).
This can be achieved by first transforming the system (4) by
a non-singular matrix $F$ such that

$$ F = \begin{pmatrix} 1 & 0 \\ F_{21} & F_{22} \end{pmatrix} \tag{8} $$

where $F_{22}$ is a $(G-1) \times (G-1)$ non-singular matrix. This
transformation will leave the first equation unaffected. The
transformed variance-covariance matrix is

$$ \Sigma^* = F \Sigma F' = \begin{pmatrix} \Sigma_{11}^* & \Sigma_{12}^* \\ \Sigma_{21}^* & \Sigma_{22}^* \end{pmatrix} \tag{9} $$

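where

$$ \Sigma_{11}^* = \Sigma_{11} \tag{10} $$

$$ \Sigma_{21}^{*\prime} = \Sigma_{12}^* = \Sigma_{11} F_{21}' + \Sigma_{12} F_{22}' \tag{11} $$

$$ \Sigma_{22}^* = F_{21} \Sigma_{11} F_{21}' + F_{22} \Sigma_{21} F_{21}' + F_{21} \Sigma_{12} F_{22}' + F_{22} \Sigma_{22} F_{22}' \tag{12} $$

We want to choose $F_{21}$ so that $\Sigma_{21}^* = 0$. This means that

$$ F_{21}' = -\Sigma_{11}^{-1} \Sigma_{12} F_{22}' \tag{13} $$

and substituting (13) into (12) yields

$$ \Sigma_{22}^* = -F_{22} \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} F_{22}' + F_{22} \Sigma_{22} F_{22}' \tag{14} $$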


We can use our freedom of choice of $F_{22}$ so as to make
$\Sigma_{22}^* = I_{22}$. This transformation not only makes the error terms
of the last G-1 equations independent of the first equation
but also makes them independent of each other and sets their
variances equal to one.
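The step from the choice of $F_{21}$ to the expression for $\Sigma_{22}^*$ can be written out term by term. The following is a worked sketch, using only the facts that $\Sigma_{11}$ is a scalar and $\Sigma_{21} = \Sigma_{12}'$, so that the choice $F_{21}' = -\Sigma_{11}^{-1}\Sigma_{12}F_{22}'$ is equivalent to $F_{21} = -F_{22}\Sigma_{21}\Sigma_{11}^{-1}$.

```latex
\begin{align*}
F_{21}\Sigma_{11}F_{21}' &= F_{22}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}F_{22}' \\
F_{22}\Sigma_{21}F_{21}' &= -F_{22}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}F_{22}' \\
F_{21}\Sigma_{12}F_{22}' &= -F_{22}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}F_{22}' \\
\Sigma_{22}^{*} &= F_{22}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)F_{22}'
\end{align*}
```

Since the matrix in parentheses is positive definite whenever $\Sigma$ is, a non-singular $F_{22}$ that reduces $\Sigma_{22}^*$ to the identity always exists.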


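In terms of the new parameters $\Sigma^*$ and $A^* = \begin{pmatrix} A_1 \\ A_2^* \end{pmatrix}$, the
trace term of (5) becomes

$$ \mathrm{tr}(\Sigma^{*-1} A^* M A^{*\prime}) = \mathrm{tr}\left[ \begin{pmatrix} \Sigma_{11}^{-1} & 0 \\ 0 & I_{22} \end{pmatrix} \begin{pmatrix} A_1 \\ A_2^* \end{pmatrix} M \begin{pmatrix} A_1' & A_2^{*\prime} \end{pmatrix} \right] = \Sigma_{11}^{-1} A_1 M A_1' + \mathrm{tr}(A_2^* M A_2^{*\prime}) \tag{15} $$

Using (15), the likelihood function (5) becomes

$$ \frac{1}{T} L(A^*, \Sigma^*) = K_2 + \log|B^*| - \frac{1}{2} \log|\Sigma_{11}| - \frac{1}{2} \mathrm{tr}(\Sigma_{11}^{-1} A_1 M A_1') - \frac{1}{2} \mathrm{tr}(A_2^* M A_2^{*\prime}) \tag{16} $$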

To obtain the concentrated likelihood function with respect
to the elements of the first equation, we have to partially
maximize (16) with respect to the elements of the second
subset, $A_2^*$, and substitute the results back into (16).

The only terms containing the elements of the second
subset in (16) are the second and the fifth terms.
Therefore, partial maximization of (16) with respect to the
elements of $A_2^* = (B_2^*, \Psi_2^*)$ will only affect those two terms.
Now


$$ \frac{\partial \log|B^*|}{\partial b_{ij}^*} = b^{*ji} = \frac{b_{ij}^{**}}{|B^*|}, \qquad i = 2, \ldots, G; \quad j = 1, \ldots, G \tag{17} $$

where $b_{ij}^*$ is the element in the ith row and jth column of
$B^*$, $b^{*ji}$ is the element in the jth row and ith column of
$(B^*)^{-1}$, $b_{ij}^{**}$ is the cofactor of $b_{ij}^*$, and $|B^*|$ is the
determinant of $B^*$.

Also we have:69

$$ \frac{\partial \log|B^*|}{\partial \Psi_{ik}^*} = 0, \qquad i = 2, \ldots, G; \quad k = 1, \ldots, G+2K \tag{18} $$

$$ \frac{1}{2} \frac{\partial \, \mathrm{tr}(A_2^* M A_2^{*\prime})}{\partial A_2^*} = \frac{1}{2} [2 A_2^* M] = A_2^* M \tag{19} $$

(17), (18) and (19) can be combined to give

$$ \frac{\partial L}{\partial A_2^*} = \{ [(B^{*\prime})^{-1}]_2 \;\; 0 \} - A_2^* M = 0 \tag{20} $$

Note that $A_2^*$ is a $(G-1) \times (2G+2K)$ matrix of the coefficients
of the second subset.

69 Note that (18) holds only under the implicit assumption
made by Sargan and Amemiya. We shall discuss and analyse
this assumption later in this section.
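From equation (20) we can calculate $A_2^* M$ and use it
to evaluate the fifth term in (16):

$$ -\frac{1}{2} \mathrm{tr}(A_2^* M A_2^{*\prime}) = -\frac{1}{2} \mathrm{tr}\left\{ [(B^{*\prime})^{-1}]_2 \;\; 0 \right\} \begin{pmatrix} B_2^{*\prime} \\ \Psi_2^{*\prime} \end{pmatrix} = -\frac{1}{2} \mathrm{tr}(I_{G-1}) = \text{const.} \tag{21} $$

Note that the elements of the second subset of $(B^*)^{-1}$,
i.e., $[(B^*)^{-1}]_2$, are equal to $b_{ij}^{**} / |B^*|$, where $b_{ij}^{**}$ is the
cofactor corresponding to the ith row and jth column of $|B^*|$.
Thus (21) holds since

$$ [(B^{*\prime})^{-1}]_2 (B_2^{*\prime}) = (b^{**} / |B^*|)(B_2^{*\prime}) \tag{22} $$

where $b^{**}$ is the matrix of the cofactors $b_{ij}^{**}$. The jth element of the ith row
of the product (22) is equal to

$$ \frac{1}{|B^*|} \sum_k b_{ik}^{**} b_{jk}^* = \begin{cases} \dfrac{1}{|B^*|} |B^*| = 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \tag{23} $$

Therefore, (22) is equal to

$$ \frac{1}{|B^*|} \begin{pmatrix} |B^*| & 0 & \cdots & 0 \\ 0 & |B^*| & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & |B^*| \end{pmatrix} = \frac{1}{|B^*|} |B^*| \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} = I \tag{24} $$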

Note  that  the  sum  of  the  product  of  each  row  of  any  matrix 
by  its  corresponding  cofactors  is  equal  to  the  determinant 
of  that  matrix,  while  the  sum  of  the  product  of  each  row  of 
any  matrix  by  the  cofactors  corresponding  to  other  rows  is 
equal  to  zero. 
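The cofactor property invoked here can be checked numerically on a small matrix. The sketch below is only an illustration; the helper names and the example matrix are assumptions, and the determinant is computed by naive expansion along the first row.

```python
def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    n = len(m)
    return [[m[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    return (-1) ** (i + j) * det(minor(m, i, j))

def row_times_cofactors(m, i, k):
    """Sum over j of m[i][j] times the cofactor of element (k, j)."""
    return sum(m[i][j] * cofactor(m, k, j) for j in range(len(m)))
```

With `i == k` the function returns the determinant of the matrix, and with `i != k` it returns zero, which is the pattern that produces the identity matrix in the derivation.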

To expand and evaluate the second term, $\log|B^*|$, in
(16) we partition the last term in (20) as

$$ A_2^* M = (B_2^* \;\; \Psi_2^*) \begin{pmatrix} M_{yy} & M_{y\phi} \\ M_{\phi y} & M_{\phi\phi} \end{pmatrix} \tag{25} $$

where

$$ M = Z Z' = \begin{pmatrix} Y Y' & Y \Phi' \\ \Phi Y' & \Phi \Phi' \end{pmatrix} = \begin{pmatrix} M_{yy} & M_{y\phi} \\ M_{\phi y} & M_{\phi\phi} \end{pmatrix}, \qquad \Phi' = (X', \; Y_{-1}', \; X_{-1}') $$

Thus we write (20) in expanded form as

$$ [(B^{*\prime})^{-1}]_2 = B_2^* M_{yy} + \Psi_2^* M_{\phi y} \tag{26} $$

$$ 0 = B_2^* M_{y\phi} + \Psi_2^* M_{\phi\phi} \tag{27} $$

Calculating $\Psi_2^*$ from (27) as

$$ \Psi_2^* = -B_2^* M_{y\phi} M_{\phi\phi}^{-1} $$

and substituting it in (26) yields

$$ [(B^{*\prime})^{-1}]_2 = B_2^* [M_{yy} - M_{y\phi} M_{\phi\phi}^{-1} M_{\phi y}] = B_2^* W \tag{28} $$

where

$$ W = Y (I - \Phi' (\Phi \Phi')^{-1} \Phi) Y' $$

Now write the second term in (16) as:70

$$ \log|B^*| = \frac{1}{2} \log|B^* W B^{*\prime}| - \frac{1}{2} \log|W| \tag{29} $$

Using (28), we can evaluate the first term in (29)

70 Noting that since $B^*$ and $W$ are square matrices of the
same order we have:
$\det(B^* W) = \det(B^*) \cdot \det(W)$
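$$ B^* W B^{*\prime} = \begin{pmatrix} B_1^* W B^{*\prime} \\ B_2^* W B^{*\prime} \end{pmatrix} = \begin{pmatrix} B_1^* W B^{*\prime} \\ [(B^{*\prime})^{-1}]_2 B^{*\prime} \end{pmatrix} = \begin{pmatrix} B_1^* W B_1^{*\prime} & B_1^* W B_2^{*\prime} \\ 0 & I_{22} \end{pmatrix} \tag{30} $$

Since $B_1^* = B_1$, we have

$$ \log|B^* W B^{*\prime}| = \log|B_1^* W B_1^{*\prime}| = \log(B_1 W B_1') \tag{31} $$

Using (21), (29) and (31) we have the concentrated
likelihood function in terms of $A_1$ and $\Sigma_{11}$ as

$$ L(A_1, \Sigma_{11}) = K + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log|\Sigma_{11}| - \frac{1}{2} \mathrm{tr}(\Sigma_{11}^{-1} A_1 M A_1') \tag{32} $$

If we partially differentiate (32) with respect to $\Sigma_{11}$ (note
that $\Sigma_{11}$ is a scalar) we will get

$$ -\frac{1}{2} \frac{1}{\Sigma_{11}} + \frac{1}{2} \frac{1}{\Sigma_{11}^2} (A_1 M A_1') = 0 \tag{33} $$

therefore

$$ \Sigma_{11} = A_1 M A_1' $$

Substituting $\Sigma_{11}$ in (32), we will get

$$ L(A_1) = K^* + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log(A_1 M A_1') \tag{34} $$

Now write the first transformed structural equation as

$$ B_1^* Y_1 + \Gamma_1^* X_1 - r_1 B_1^* Y_{1,-1} - r_1 \Gamma_1^* X_{1,-1} = \varepsilon_1 \tag{35} $$

or

$$ (B_1^* \;\; \Psi_1^*) \begin{pmatrix} Y_1 \\ \Phi_1 \end{pmatrix} = \varepsilon_1 $$

and note that

$$ A_1 = (B_1^* \;\; 0 \;\; \Psi_1^* \;\; 0) $$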


Therefore, the last term in (34) can be partitioned as

$$ A_1 M A_1' = (B_1^* \;\; \Psi_1^*) M_1 (B_1^* \;\; \Psi_1^*)' = \varepsilon_1 \varepsilon_1' \tag{36} $$

where

$$ M_1 = \begin{pmatrix} Y_1 Y_1' & Y_1 \Phi_1' \\ \Phi_1 Y_1' & \Phi_1 \Phi_1' \end{pmatrix} $$

Therefore, the concentrated likelihood function (34) becomes

$$ L(B_1, \Gamma_1) = K^* + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log(\varepsilon_1 \varepsilon_1') \tag{37} $$

Equation (37) is exactly the same as equation (14) of
Amemiya (1966), except that he considered the third term as part of
the constant term.

However, the validity of (37) rests on the
validity of the partial derivatives given in (17) to (19).
But in general (18) does not hold. This is because

$$ \Psi_2^* = [\Gamma^*, \; (-RB)^*, \; (-R\Gamma)^*]_2 \tag{38} $$

Since the second term in $\Psi_2^*$ is a linear function of the
elements of the second subset of the $B$ matrix, the partial
derivative (18) will be equal to

$$ \frac{\partial \log|B^*|}{\partial \Psi_{ik}^*} = \left( 0 \; \cdots \; 0 \;\Big|\; \frac{\partial \log|B^*|}{\partial (-RB)^*} \;\Big|\; 0 \; \cdots \; 0 \right), \qquad i = 2, \ldots, G; \quad k = 1, \ldots, G+2K \tag{39} $$

(39) demonstrates that unless we ignore all the information
contained in $\Psi_2^*$, we cannot derive the Sargan and Amemiya
ALIML estimator. This is, in fact, the assumption made, but
not explicitly explained, by Sargan (1961, p. 421); this
assumption does not affect the consistency of this
estimator. However, its effect on the asymptotic efficiency
of the Sargan ALIML is not addressed by Amemiya.

B.  An  Alternative  Approach  to  the  Derivation  of  LIML  in  the 
Presence  of  Autocorrelation 

An  alternative  way  of  deriving  the  LIML  in  the  presence 
of  autocorrelation  can  be  based  on  the  restriction  imposed 
on  the  structure  of  the  R  matrix.  Instead  of  ignoring  the 
information  about  the  structure  of  the  augmented  reduced 
form  equations  other  than  the  one  under  consideration,  we 
can  assume  that  the  coefficients  of  autocorrelations  are 
equal  across  equations.  Under  this  assumption  we  can  derive 
a LIML estimator which we shall refer to as the generalized
LIML (GLIML). The derivation of GLIML is as follows. Consider
system (1) again

$$ B Y + \Gamma X = U, \qquad U = R U_{-1} + E, \qquad E(\varepsilon_t \varepsilon_t') = \Sigma \tag{40} $$

Assuming equal coefficients of autocorrelation across
equations, we transform (40) with the aid of the $(T-1) \times T$
Cochrane-Orcutt transformation matrix $Q$

$$ Q = \begin{pmatrix} -r & 1 & 0 & \cdots & 0 & 0 \\ 0 & -r & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & -r & 1 \end{pmatrix} \tag{41} $$

to the system free of autocorrelation

$$ B Y Q' + \Gamma X Q' = U Q' = E $$

or

$$ B \tilde{Y} + \Gamma \tilde{X} = E $$

or

$$ A Z = E \tag{42} $$

where

$$ A = (B, \; \Gamma), \qquad Z' = (\tilde{Y}', \; \tilde{X}') $$
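The effect of the Cochrane-Orcutt matrix on a single row of disturbances can be sketched in a few lines. The function names below are hypothetical, and the example handles one series at a time, whereas the derivation applies $Q'$ to whole data matrices.

```python
def cochrane_orcutt_matrix(r, T):
    """(T-1) x T matrix with -r on the diagonal and 1 just to its right."""
    return [[-r if j == i else 1.0 if j == i + 1 else 0.0 for j in range(T)]
            for i in range(T - 1)]

def transform(Q, series):
    """Return Q applied to one series, i.e. the values u_t - r * u_{t-1}."""
    return [sum(q * s for q, s in zip(row, series)) for row in Q]
```

For a noiseless series such as `[1.0, 0.5, 0.25, 0.125]`, which follows $u_t = 0.5 u_{t-1}$ exactly, `transform(cochrane_orcutt_matrix(0.5, 4), ...)` returns three zeros. The transformation removes the first order dependence at the cost of one observation.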

The logarithmic likelihood function of the system (42) is

$$ L = K_1 + T \log|B| - \frac{T}{2} \log|\Sigma| - \frac{1}{2} \mathrm{tr}(\Sigma^{-1} A M A') \tag{43} $$

where

$$ M = Z Z' $$

Since the estimation of the first equation of (40) is the
focus of attention, we partition $A$ and $\Sigma$ as

$$ A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} = \begin{pmatrix} B_1 & \Gamma_1 \\ B_2 & \Gamma_2 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} $$

where $A_1$ is a $1 \times (G+K)$ matrix and $A_2$ is a $(G-1) \times (G+K)$ matrix.

$A_1$ and $\Sigma_{11}$ are parameters to be estimated and $A_2$, $\Sigma_{12} = \Sigma_{21}'$
and $\Sigma_{22}$ are parameters to be eliminated by partial
maximization of (43). To do this we need to transform system
(42) by a non-singular matrix such as $F$


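$$ F = \begin{pmatrix} 1 & 0 \\ F_{21} & F_{22} \end{pmatrix} $$

Transformation of the system (42) by $F$ produces a new
variance-covariance matrix such as

$$ \Sigma^* = F \Sigma F' = \begin{pmatrix} \Sigma_{11}^* & \Sigma_{12}^* \\ \Sigma_{21}^* & \Sigma_{22}^* \end{pmatrix} \tag{44} $$

where

$$ \Sigma_{11}^* = \Sigma_{11}, \qquad \Sigma_{21}^{*\prime} = \Sigma_{12}^* = \Sigma_{11} F_{21}' + \Sigma_{12} F_{22}' \tag{45} $$

$$ \Sigma_{22}^* = F_{21} \Sigma_{11} F_{21}' + F_{22} \Sigma_{21} F_{21}' + F_{21} \Sigma_{12} F_{22}' + F_{22} \Sigma_{22} F_{22}' \tag{46} $$

We will choose $F_{21}$ so as to make $\Sigma_{12}^* = 0$. This means

$$ F_{21}' = -\Sigma_{11}^{-1} \Sigma_{12} F_{22}' \tag{47} $$

Substituting (47) into (46) yields

$$ \Sigma_{22}^* = F_{22} [\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}] F_{22}' \tag{48} $$

We will choose $F_{22}$ so as to make $\Sigma_{22}^* = I_{22}$. Using the
above results, the last term in (43) becomes

$$ \mathrm{tr}(\Sigma^{*-1} A^* M A^{*\prime}) = \mathrm{tr}\left[ \begin{pmatrix} \Sigma_{11}^{-1} & 0 \\ 0 & I_{22} \end{pmatrix} \begin{pmatrix} A_1 \\ A_2^* \end{pmatrix} M \begin{pmatrix} A_1' & A_2^{*\prime} \end{pmatrix} \right] = \Sigma_{11}^{-1} A_1 M A_1' + \mathrm{tr}(A_2^* M A_2^{*\prime}) \tag{49} $$

Substituting (49) into (43) yields

$$ \frac{1}{T} L^* = K_2 + \log|B^*| - \frac{1}{2} \log|\Sigma_{11}| - \frac{1}{2} \mathrm{tr}(\Sigma_{11}^{-1} A_1 M A_1') - \frac{1}{2} \mathrm{tr}(A_2^* M A_2^{*\prime}) \tag{50} $$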

To obtain a concentrated likelihood function in terms of the
elements of the first equation, we have to eliminate $A_2^*$
from (50) via a process of partial maximization. The contribution
to (50) from the elements of the second subset comes from
the second and fifth terms. Therefore

$$ \frac{\partial \log|B^*|}{\partial b_{ij}^*} = b^{*ji} = \frac{b_{ij}^{**}}{|B^*|}, \qquad i = 2, \ldots, G; \quad j = 1, \ldots, G \tag{51} $$

where $b_{ij}^*$, $b^{*ji}$ and $b_{ij}^{**}$ are defined previously.

$$ \frac{\partial \log|B^*|}{\partial \Gamma_{ik}^*} = 0, \qquad i = 2, \ldots, G; \quad k = 1, \ldots, K \tag{52} $$

Unlike equation (18), (52) holds unconditionally, since
$\Gamma_{ik}^*$ does not include any of the elements of $b_{ij}^*$.

$$ \frac{\partial \, \mathrm{tr}(A_2^* M A_2^{*\prime})}{\partial A_2^*} = 2 A_2^* M \tag{53} $$

Combining (51) and (53) yields

$$ \frac{\partial L}{\partial A_2^*} = \{ [(B^{*\prime})^{-1}]_2 \;\; 0 \} - A_2^* M = 0 \tag{54} $$

Using the result in (54), the fifth term in (50) will become

$$ -\frac{1}{2} \mathrm{tr}(A_2^* M A_2^{*\prime}) = -\frac{1}{2} \mathrm{tr}\left\{ [(B^{*\prime})^{-1}]_2 \;\; 0 \right\} \begin{pmatrix} B_2^{*\prime} \\ \Gamma_2^{*\prime} \end{pmatrix} = -\frac{1}{2} \mathrm{tr}(I_{G-1}) = \text{const.} \tag{55} $$

To evaluate the second term in (50), we partition the last
term in (54) as

$$ A_2^* M = [B_2^* \;\; \Gamma_2^*] \begin{pmatrix} M_{yy} & M_{yx} \\ M_{xy} & M_{xx} \end{pmatrix} \tag{56} $$

where the moment matrices are formed from the transformed data $\tilde{Y} = Y Q'$, $\tilde{X} = X Q'$.

Therefore, we can write (54) as

$$ [(B^{*\prime})^{-1}]_2 = B_2^* M_{yy} + \Gamma_2^* M_{xy}, \qquad 0 = B_2^* M_{yx} + \Gamma_2^* M_{xx} \tag{57} $$

The second equation in (57) gives

$$ \Gamma_2^* = -B_2^* M_{yx} M_{xx}^{-1} \tag{58} $$

Substituting (58) into the first equation in (57) yields

$$ [(B^{*\prime})^{-1}]_2 = B_2^* [M_{yy} - M_{yx} M_{xx}^{-1} M_{xy}] = B_2^* W \tag{59} $$

where

$$ W = \tilde{Y} (I - \tilde{X}' (\tilde{X} \tilde{X}')^{-1} \tilde{X}) \tilde{Y}' $$

Analogously to (29), the second term in (50) can be written as

$$ \log|B^*| = \frac{1}{2} \log|B^* W B^{*\prime}| - \frac{1}{2} \log|W| \tag{60} $$

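Using (59), we can evaluate the first term in (60)

$$ B^* W B^{*\prime} = \begin{pmatrix} B_1^* W B^{*\prime} \\ B_2^* W B^{*\prime} \end{pmatrix} = \begin{pmatrix} B_1^* W B^{*\prime} \\ [(B^{*\prime})^{-1}]_2 B^{*\prime} \end{pmatrix} = \begin{pmatrix} B_1^* W B_1^{*\prime} & B_1^* W B_2^{*\prime} \\ 0 & I_{22} \end{pmatrix} \tag{61} $$

Note that $B_1^* = B_1$, therefore

$$ \log|B^* W B^{*\prime}| = \log|B_1 W B_1'| = \log(B_1 W B_1') \tag{62} $$

Combining (50), (55), (60) and (62), the concentrated
likelihood function in terms of $A_1$ and $\Sigma_{11}$ will become

$$ L(A_1, \Sigma_{11}) = K + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log|\Sigma_{11}| - \frac{1}{2} \mathrm{tr}(\Sigma_{11}^{-1} A_1 M A_1') \tag{63} $$

Partial differentiation of (63) with respect to $\Sigma_{11}$ yields

$$ \hat{\Sigma}_{11} = A_1 M A_1' \tag{64} $$

Substituting (64) in (63), we shall have

$$ L(A_1) = K^* + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log(A_1 M A_1') \tag{65} $$

Partitioning of the last term in (65) with respect to the
coefficients of the first equation that are non-zero and
those which are zero yields

$$ A_1 = [B_1^* \;\; 0 \;\; \Gamma_1^* \;\; 0] \tag{66} $$

$$ A_1 M A_1' = (B_1^* \;\; \Gamma_1^*) M_1 (B_1^* \;\; \Gamma_1^*)' \tag{67} $$

$$ M_1 = \begin{pmatrix} \tilde{Y}_1 \tilde{Y}_1' & \tilde{Y}_1 \tilde{X}_1' \\ \tilde{X}_1 \tilde{Y}_1' & \tilde{X}_1 \tilde{X}_1' \end{pmatrix} \tag{68} $$

Therefore

$$ A_1 M A_1' = (B_1^* \tilde{Y}_1 + \Gamma_1^* \tilde{X}_1)(B_1^* \tilde{Y}_1 + \Gamma_1^* \tilde{X}_1)' \tag{69} $$

Using (69) and noting that $B_1^* = B_1$ and $\Gamma_1^* = \Gamma_1$, the
concentrated likelihood function (63) becomes

$$ L(B_1^*, \Gamma_1^*) = K^* + \frac{1}{2} \log(B_1 W B_1') - \frac{1}{2} \log|W| - \frac{1}{2} \log\left[ (B_1 \tilde{Y}_1 + \Gamma_1 \tilde{X}_1)(B_1 \tilde{Y}_1 + \Gamma_1 \tilde{X}_1)' \right] \tag{70} $$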

Note that $B_1$ is the first row of the $B$ matrix.

Maximization of (70) with respect to $B_1$ and $\Gamma_1$ for a given
value of r yields the GLIML estimates of the parameters of
the first equation. Note that we have to vary r over the interval
(-1, +1) and choose the estimates of $B_1$ and $\Gamma_1$ which
maximize (70). Derivation of an explicit formula for the
estimates of $B_1$ and $\Gamma_1$, along with the asymptotic distribution
of GLIML, is addressed in Chapter II.
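The search over r described above can be organised as a simple grid search. The sketch below is a generic illustration only: `objective` stands for a user-supplied function (hypothetical here) that returns the maximized value of the concentrated likelihood for a given r, and the step size of 0.01 is an assumption rather than the grid used in the experiments.

```python
def grid_search_r(objective, step=0.01):
    """Return the r strictly inside (-1, 1) that maximizes objective(r)."""
    n_points = int(round(2.0 / step)) - 1          # interior grid points only
    grid = [-1.0 + step * (k + 1) for k in range(n_points)]
    return max(grid, key=objective)
```

For instance, `grid_search_r(lambda r: -(r - 0.3) ** 2)` returns the grid point nearest 0.3. In practice the grid can be refined around the first maximizer to obtain more decimal places.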

C. Estimation of the Autoregressive System using all T
observations

Extension of GLIML to the case using all T observations
is straightforward. However, this is not true of Sargan
and Amemiya's method. Due to the form of their
transformation, their method cannot be extended to the case
using all T observations.

Consider system (40) again

$$ B Y + \Gamma X = U, \qquad U = R U_{-1} + E \tag{71} $$

Assuming equal coefficients of autocorrelation across
equations, we can transform (71) with the aid of the
Prais-Winsten transformation. The transformed system will be

$$ B Y P' + \Gamma X P' = U P' = E \tag{72} $$

Derivation of the GLIML on the basis of the transformed
variables in (72) is the same as the one for system (42)
except for the additional term which is due to the Jacobian
of the transformation. However, as we discussed earlier, the
omission of the Jacobian part of the likelihood function
does not materially affect the performance of the estimator.
Due to this fact and also the computational problems that
inclusion of the Jacobian of the transformation creates, we
have omitted it from the concentrated likelihood function
associated with GLIML when it uses the Prais-Winsten
transformation.
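The difference between the T-1 and the T observation transformations can be illustrated on a single series. This is a minimal sketch with a hypothetical function name; it uses the standard form of the Prais-Winsten transformation, in which the first observation is rescaled by the square root of 1 - r^2 and the remaining ones are quasi-differenced.

```python
import math

def prais_winsten(series, r):
    """Transform a series of length T into T observations free of AR(1) dependence."""
    first = math.sqrt(1.0 - r * r) * series[0]     # retained first observation
    rest = [series[t] - r * series[t - 1] for t in range(1, len(series))]
    return [first] + rest
```

Dropping the first element of the result gives the Cochrane-Orcutt version, so the two transformations differ only in the treatment of the initial observation.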


APPENDIX  II:  COMPUTATIONAL  STEPS  OF  DIFFERENT  ESTIMATORS 


In this appendix we shall first outline the computational
steps and the computer programs used for the estimators
designed for systems without lagged endogenous variables.
Then the computational steps and the computer programs of
the estimators for dynamic autoregressive systems will be
outlined. In all the APL programs the following variables
are global.

The exogenous variables, which are

X11, X22, X33, X44, X55, X66

Z = (X11, X22, X33, X44, X55, X66, C)

where

C is a constant term which is a vector of ones.

YDATA is a matrix which contains all the observations
on the endogenous variables.


A.  Systems  with  no  Lagged  Endogenous  Variables 

1.  Fair's  Estimator 

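$$ y_1 = Y_1 \beta_1 + X_1 \gamma_1 + u_1 = Z_1 \delta_1 + u_1 $$

1. Take $(X_1, \; X_{1,-1}, \; y_{1,-1}, \; Y_{1,-1})$ as the first-stage regressors. Denote

$$ \hat{Z}_1 = (\hat{Y}_1 \;\; X_1) $$

where $\hat{Y}_1$ is the fitted value of $Y_1$ from the least squares regression of $Y_1$ on the first-stage regressors, and

$$ \hat{\delta}_1 = (\hat{Z}_1' \hat{Z}_1)^{-1} \hat{Z}_1' y_1 $$

2. $\hat{u}_1 = y_1 - Z_1 \hat{\delta}_1$

$$ \hat{r} = \sum_t \hat{u}_{1t} \hat{u}_{1,t-1} \Big/ \sum_t \hat{u}_{1,t-1}^2 $$

3. Transforming the structural equation

$$ \bar{y}_1 = y_1 - \hat{r} y_{1,-1}, \qquad \bar{X}_1 = X_1 - \hat{r} X_{1,-1}, \qquad \bar{Y}_1 = \hat{Y}_1 - \hat{r} Y_{1,-1} $$

Denote

$$ \hat{\bar{Z}}_1 = (\bar{Y}_1 \;\; \bar{X}_1) $$

4. $\hat{\delta}_1 = (\hat{\bar{Z}}_1' \hat{\bar{Z}}_1)^{-1} \hat{\bar{Z}}_1' \bar{y}_1$

5. Iterate steps 2, 3, and 4 until convergence.

To calculate the variance-covariance matrix of $\hat{\delta}_1$

6. $\bar{u}_1 = \bar{y}_1 - \bar{Z}_1 \hat{\delta}_1$, where $\bar{Z}_1 = (Y_1 - \hat{r} Y_{1,-1} \;\; \bar{X}_1)$

$$ \hat{\sigma}_{11} = \bar{u}_1' \bar{u}_1 / (T - K) $$

$$ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11} (\hat{\bar{Z}}_1' \hat{\bar{Z}}_1)^{-1} $$

Note

K = (H - 1) + J

where

H = Number of included endogenous variables.
J = Number of included exogenous variables.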
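Steps 2 and 3 of the iteration can be sketched as follows. The function names are hypothetical. The replacement of an estimate above one by 0.99 mirrors what appears to be a safeguard in the APL listing and should be read as an assumption about that listing.

```python
def estimate_r(residuals, cap=0.99):
    """First order autocorrelation coefficient from structural residuals."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(residuals[t - 1] ** 2 for t in range(1, len(residuals)))
    r = num / den
    return cap if r > 1.0 else r                   # guard against r above one

def quasi_difference(series, r):
    """Return the T-1 transformed values x_t - r * x_{t-1}."""
    return [series[t] - r * series[t - 1] for t in range(1, len(series))]
```

Each pass of the iteration recomputes the residuals from the latest coefficient estimates, updates r with `estimate_r`, and re-estimates the coefficients from the quasi-differenced data until the estimates stop changing.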
∇ FAIR1
  AFCOEF1←0 7⍴0
  AVFCOEF←0 5⍴0
  AFBAD1←0⍴0
  II←1
 L1:Y1←YDATA[(30×(II-1))+⍳30;1]
  Y2←YDATA[(30×(II-1))+⍳30;2]
  Y3←YDATA[(30×(II-1))+⍳30;3]
  ZL←¯1 0↓Y1,Y2,Y3,X11,X44,C
  ZR←(1 0↓X11,X44),ZL
  Y2H←ZR+.×(1↓Y2)⌹ZR
  Y3H←ZR+.×(1↓Y3)⌹ZR
  B1←(1↓Y1)⌹(Y2H,Y3H,1 0↓X1)
  I←0
 L2:B←B1
  R←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×B1
  AFBAD1←AFBAD1,II×⍳(R>1)
  R←(R,0.99)[1+R>1]
  X1R←(1 0↓X1)-R×¯1 0↓X1
  Y1R←(1↓Y1)-R×¯1↓Y1
  Y2R←(1↓Y2)-R×¯1↓Y2
  Y3R←(1↓Y3)-R×¯1↓Y3
  Y2HR←Y2H-R×¯1↓Y2
  Y3HR←Y3H-R×¯1↓Y3
  B1←Y1R⌹(Y2HR,Y3HR,X1R)
  →L2×⍳(20>I←I+1)∧(0.005<|B-B1)
  AFCOEF1←AFCOEF1,[1](B1,R,I)
  URH←Y1R-(Y2R,Y3R,X1R)+.×B1
  S2←(+/URH*2)÷26
  VAR←S2×⌹(⍉(Y2HR,Y3HR,X1R))+.×Y2HR,Y3HR,X1R
  AVFCOEF←AVFCOEF,[1]VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
  →L1×⍳(100>II←II+1)
  '***END'
∇
2. Fair-Brundy and Jorgenson Estimator

$$ y_1 = Y_1 \beta_1 + X_1 \gamma_1 + u_1 = Z_1 \delta_1 + u_1 $$

The augmented reduced form is

$$ Y_t = -X_t \Gamma B^{-1} + Y_{t-1} B R B^{-1} + X_{t-1} \Gamma R B^{-1} + E_t B^{-1} $$

or

$$ Y_t = X_t \Pi_1 + Y_{t-1} \Pi_2 + X_{t-1} \Pi_3 + V_t $$

1. Estimate all structural equations by 2SLS and use the
estimated coefficients to form consistent estimates of $\Pi_1$,
$\Pi_2$, and $\Pi_3$. Also from the estimated residuals obtain
estimates of the autocorrelation coefficients and form the $R$
matrix. Then obtain the fitted value for $Y_1$ as

$$ \hat{Y}_1 = X_t \hat{\Pi}_{11} + Y_{t-1} \hat{\Pi}_{21} + X_{t-1} \hat{\Pi}_{31} $$

where

$$ \hat{\Pi}_1 = -\hat{\Gamma} \hat{B}^{-1}, \qquad \hat{\Pi}_2 = \hat{B} \hat{R} \hat{B}^{-1}, \qquad \hat{\Pi}_3 = \hat{\Gamma} \hat{R} \hat{B}^{-1} $$

2. $\hat{u}_1 = y_1 - Z_1 \hat{\delta}_1$

where $\hat{\delta}_1$ is estimated in the first step.

$$ \hat{r} = \sum_t \hat{u}_{1t} \hat{u}_{1,t-1} \Big/ \sum_t \hat{u}_{1,t-1}^2 $$

3. Using $\hat{r}$, transform the first structural equation as

$$ y_1 - \hat{r} y_{1,-1} = (Y_1 - \hat{r} Y_{1,-1}) \beta_1 + (X_1 - \hat{r} X_{1,-1}) \gamma_1 + \varepsilon_1 $$

Define

$$ \bar{Z}_1 = [(Y_1 - \hat{r} Y_{1,-1}) \;\; (X_1 - \hat{r} X_{1,-1})] $$

$$ \bar{y}_1 = y_1 - \hat{r} y_{1,-1} $$

$$ W_1 = [(\hat{Y}_1 - \hat{r} Y_{1,-1}) \;\; (X_1 - \hat{r} X_{1,-1})] $$

4. $\hat{\delta}_1 = (W_1' \bar{Z}_1)^{-1} W_1' \bar{y}_1$

5. Iterate steps 2, 3, and 4 until convergence.

To calculate the variance-covariance matrix of $\hat{\delta}_1$

$$ \bar{u}_1 = \bar{y}_1 - \bar{Z}_1 \hat{\delta}_1 $$

$$ \hat{\sigma}_{11} = \bar{u}_1' \bar{u}_1 / (T - K) $$

$$ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11} (W_1' \bar{Z}_1)^{-1} $$
' 
∇ FBRUNDY
[  1]  11+ 1 
[ 2]  AFBBAD←0⍴0
[ 3]  AFBCOEF← 0 7 ⍴0
[ 4]  AVFBR← 0 5 ⍴0
[ 5]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 6]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 7]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 8]  Y1H←Z+.×Y1⌹Z  ⍝ FIRST-STAGE FITTED VALUES
[ 9]  Y2H←Z+.×Y2⌹Z
[10]  Y3H←Z+.×Y3⌹Z
[11]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ 2SLS ON EACH STRUCTURAL EQUATION
[12]  B2←Y2⌹(Y1H,X2)
[13]  B3←Y3⌹(Y2H,X3)
[14]  R1←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×B1  ⍝ AUTOCORRELATION OF EACH RESIDUAL SERIES
[15]  R2←(1↓E)⌹¯1↓E←Y2-(Y1,X2)+.×B2
[16]  R3←(1↓E)⌹¯1↓E←Y3-(Y2,X3)+.×B3
[17]  B11←-B1[1]
[18]  B12←-B1[2]
[19]  B21←-B2[1]
[20]  B31←-B3[1]
[21]  R← 3 3 ⍴R1,0,0,0,R2,0,0,0,R3
[22]  BB← 3 3 ⍴1,B11,B12,B21,1,0,0,B31,1
[23]  Q1← 1 7 ⍴B1[3],0,0,B1[4],0,0,B1[5]
[24]  Q2← 1 7 ⍴0,B2[2],0,B2[3],0,B2[4],B2[5]
[25]  Q3← 1 7 ⍴0,B3[2],B3[3],0,B3[4],0,B3[5]
[26]  Q←Q1,[1] Q2,[1] Q3
[27]  P1←(⌹BB)+.×R+.×BB
[28]  P2←(⌹BB)+.×Q
[29]  P3←(⌹BB)+.×R+.×Q
[30]  YY←⍉(3 30 ⍴(Y1,Y2,Y3))
[31]  Y2H←(P1[2;]+.×⍉(¯1 0 ↓YY))+(P2[2;]+.×⍉(1 0 ↓Z))-P3[2;]+.×⍉ ¯1 0 ↓Z
[32]  Y3H←(P1[3;]+.×⍉(¯1 0 ↓YY))+(P2[3;]+.×⍉(1 0 ↓Z))-P3[3;]+.×⍉ ¯1 0 ↓Z
[33]  I←0
[34]  L1:RR←R1
[35]  X1R←(1 0 ↓X1)-R1× ¯1 0 ↓X1  ⍝ QUASI-DIFFERENCED VARIABLES (T-1 OBSERVATIONS)
[36]  Y1R←(1↓Y1)-R1×¯1↓Y1
[37]  Y2HR←Y2H-R1×¯1↓Y2
[38]  Y3HR←Y3H-R1×¯1↓Y3
[39]  Y2R←(1↓Y2)-R1×¯1↓Y2
[40]  Y3R←(1↓Y3)-R1×¯1↓Y3
[41]  W←Y2HR,Y3HR,X1R
[42]  M←Y2R,Y3R,X1R
[43]  BB1←(⌹((⍉W)+.×M))+.×((⍉W)+.×Y1R)  ⍝ INSTRUMENTAL VARIABLE ESTIMATE
[44]  R1←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×BB1
[45]  AFBBAD←AFBBAD,II×⍳(R1>1)  ⍝ RECORD REPLICATIONS WITH R1 ABOVE ONE
[46]  R1←(R1,0.99)[1+R1>1]
[47]  →L1×⍳(20>I←I+1)∧(0.005<|RR-R1)  ⍝ AT MOST 20 PASSES
[48]  AFBCOEF←AFBCOEF,[1] (BB1,R1,I)
[49]  UH←Y1R-(Y2R,Y3R,X1R)+.×BB1
[50]  S2←(+/UH*2)÷26
[51]  VAR←S2×⌹((⍉W)+.×M)
[52]  AVFBR←AVFBR,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[53]  →LL1×⍳(100>II←II+1)
[54]  '***END'
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3.  Modified  Brundy  and  Jorgenson  Estimator(T 
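Observation)

\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

Using the ordinary reduced form

\[ Y = -X\Gamma B^{-1} + UB^{-1} = X\Pi + V \]

1. Run 2SLS on all the structural equations and use the estimated coefficients to form a consistent estimate of $\Pi$. Then obtain

\[ \hat{Y}_1 = X\hat{\Pi}_1 \]

2. Compute the residuals and the autocorrelation coefficient (Prais-Winsten formula):

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \frac{\sum_{t=2}^{T}\hat{u}_{1t}\,\hat{u}_{1,t-1}}{\sum_{t=3}^{T}\hat{u}_{1,t-1}^{2}} \]

3. Transform the first structural equation using the Prais-Winsten transformation matrix and denote

\[ \tilde{Z}_1 = (\hat{P}Y_1 \;\; \hat{P}X_1) \]

\[ \tilde{W}_1 = (\hat{P}\hat{Y}_1 \;\; \hat{P}X_1) \]

where $\hat{P}$ is constructed using the autocorrelation coefficient estimated in the second step.

4. Estimate

\[ \hat{\delta}_1 = (\tilde{W}_1'\tilde{Z}_1)^{-1}\tilde{W}_1'(\hat{P}y_1) \]

5. Iterate steps 2, 3, and 4 until convergence.

Calculation of variance-covariance matrix

6. Compute

\[ \tilde{u}_1 = \hat{P}y_1 - (\hat{P}Y_1 \;\; \hat{P}X_1)\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \tilde{u}_1'\tilde{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\tilde{W}_1'\tilde{Z}_1)^{-1} \]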
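
The residual-based estimate of step 2 and the transformation of step 3 can be illustrated with a short Python sketch. It is not part of the thesis programs: the function names are hypothetical, the summation limits (numerator from t = 2, denominator from t = 3) are an assumption about the Prais-Winsten formula referred to in step 2, and the cap at 0.99 mirrors what the APL listings appear to do when the estimate exceeds one.

```python
import math


def prais_winsten_r(u):
    """Assumed Prais-Winsten estimate:
    sum_{t=2..T} u_t*u_{t-1} / sum_{t=3..T} u_{t-1}^2."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(v * v for v in u[1:-1])  # u_2 ... u_{T-1}
    return num / den


def cap_r(r, ceiling=0.99):
    """Replace an estimate above one by 0.99, as the listings appear to do."""
    return ceiling if r > 1.0 else r


def prais_winsten_transform(x, r):
    """Apply P to a series: the first observation is scaled by sqrt(1 - r^2),
    the remaining T-1 observations are quasi-differenced."""
    first = math.sqrt(1.0 - r * r) * x[0]
    return [first] + [x[t] - r * x[t - 1] for t in range(1, len(x))]
```

Because the first observation is rescaled rather than dropped, the transformed series keeps all T observations, which is what distinguishes the T-observation variants from the T-1 (Cochrane-Orcutt) variants.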
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V  BRUNDY 
[ 1]  II←1
[ 2]  ABBCOEF← 0 7 ⍴0
[ 3]  ABBAD←0⍴0
[ 4]  AVBR← 0 5 ⍴0
[ 5]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 6]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 7]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 8]  Y1H←Z+.×Y1⌹Z  ⍝ FIRST-STAGE FITTED VALUES
[ 9]  Y2H←Z+.×Y2⌹Z
[10]  Y3H←Z+.×Y3⌹Z
[11]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ 2SLS ON EACH STRUCTURAL EQUATION
[12]  B2←Y2⌹(Y1H,X2)
[13]  B3←Y3⌹(Y2H,X3)
[14]  B11←-B1[1]
[15]  B12←-B1[2]
[16]  B21←-B2[1]
[17]  B31←-B3[1]
[18]  BB← 3 3 ⍴1,B11,B12,B21,1,0,0,B31,1
[19]  Q1← 1 7 ⍴B1[3],0,0,B1[4],0,0,B1[5]
[20]  Q2← 1 7 ⍴0,B2[2],0,B2[3],0,B2[4],B2[5]
[21]  Q3← 1 7 ⍴0,B3[2],B3[3],0,B3[4],0,B3[5]
[22]  Q←Q1,[1] Q2,[1] Q3
[23]  P←(⌹BB)+.×Q  ⍝ REDUCED FORM COEFFICIENTS FROM THE STRUCTURAL ESTIMATES
[24]  Y2H←P[2;]+.×(⍉Z)
[25]  Y3H←P[3;]+.×(⍉Z)
[26]  R←R1←I←0
[27]  L1:R1←R
[28]  R←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×B1  ⍝ AUTOCORRELATION OF THE RESIDUALS
[29]  ABBAD←ABBAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[30]  R←(R,0.99)[1+R>1]
[31]  X1R←(1 0 ↓X1)-R× ¯1 0 ↓X1
[32]  Y1R←(1↓Y1)-R×¯1↓Y1
[33]  Y2HR←(1↓Y2H)-R×¯1↓Y2H
[34]  Y3HR←(1↓Y3H)-R×¯1↓Y3H
[35]  Y2R←(1↓Y2)-R×¯1↓Y2
[36]  Y3R←(1↓Y3)-R×¯1↓Y3
[37]  W←Y2HR,Y3HR,X1R
[38]  M←Y2R,Y3R,X1R
[39]  B1←(⌹((⍉W)+.×M))+.×((⍉W)+.×Y1R)  ⍝ INSTRUMENTAL VARIABLE ESTIMATE
[40]  →L1×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[41]  ABBCOEF←ABBCOEF,[1] (B1,R,I)
[42]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[43]  S2←(+/UH*2)÷26
[44]  VAR←S2×⌹((⍉W)+.×M)
[45]  AVBR←AVBR,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[46]  →LL1×⍳(100>II←II+1)
[47]  '***END'


V 
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4.  Modified  2SLS 


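\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

1. Compute

\[ \hat{Y}_1 = X(X'X)^{-1}X'Y_1 \]

Denote

\[ \hat{Z}_1 = (\hat{Y}_1 \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'y_1 \]

2. Compute the residuals and the autocorrelation coefficient:

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \frac{\sum_{t=2}^{T}\hat{u}_{1t}\,\hat{u}_{1,t-1}}{\sum_{t=2}^{T}\hat{u}_{1,t-1}^{2}} \]

3. Transform the structural equation as

\[ \bar{y}_1 = y_1 - \hat{r}\,y_{1,-1} \]

\[ \bar{X}_1 = X_1 - \hat{r}\,X_{1,-1} \]

\[ \bar{Y}_1 = Y_1 - \hat{r}\,Y_{1,-1} \]

\[ \hat{\bar{Y}}_1 = \bar{X}(\bar{X}'\bar{X})^{-1}\bar{X}'\bar{Y}_1 \]

where $\bar{X} = QX$, and $Q$ is the CORC matrix.

4. Denote

\[ \hat{\bar{Z}}_1 = (\hat{\bar{Y}}_1 \;\; \bar{X}_1) \]

Therefore

\[ \hat{\delta}_1 = (\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1}\hat{\bar{Z}}_1'\bar{y}_1 \]

5. Iterate steps 2, 3, and 4 until convergence.

Calculation of the variance-covariance matrix of $\hat{\delta}_1$

6. Compute

\[ \hat{u}_1 = \bar{y}_1 - (\bar{Y}_1 \;\; \bar{X}_1)\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1} \]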
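
The iteration in step 5 can be illustrated on a deliberately simplified model. The sketch below is in Python and is not taken from the thesis programs: it uses a single regressor with no intercept in place of the full 2SLS step, all names are hypothetical, and the stopping rule (stop once r changes by no more than 0.005, or after 20 passes) is an assumption read off the APL listings.

```python
def corc_r(u):
    """CORC estimate: regression of u_t on u_{t-1} over t = 2..T."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(v * v for v in u[:-1])
    return num / den


def ols_slope(x, y):
    """Least-squares slope through the origin (stand-in for the 2SLS step)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def quasi_difference(x, r):
    """x_t - r*x_{t-1} for t = 2..T; the first observation is lost."""
    return [x[t] - r * x[t - 1] for t in range(1, len(x))]


def iterate_corc(x, y, tol=0.005, max_iter=20):
    """Alternate between estimating r from the untransformed residuals and
    re-estimating the slope on the transformed data."""
    b = ols_slope(x, y)
    r = 0.0
    passes = 0
    for passes in range(1, max_iter + 1):
        r_prev = r
        u = [yi - b * xi for xi, yi in zip(x, y)]
        r = corc_r(u)
        if r > 1.0:          # assumed safeguard, as in the listings
            r = 0.99
        b = ols_slope(quasi_difference(x, r), quasi_difference(y, r))
        if abs(r - r_prev) <= tol:
            break
    return b, r, passes
```

Note that the residuals are always computed from the untransformed variables, as in step 2, while the coefficients are re-estimated from the transformed ones.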
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V  TSLS 

[ 1]  ATSCOEF← 0 7 ⍴0
[ 2]  ATSBAD←0⍴0
[ 3]  AVTSLS← 0 5 ⍴0
[ 4]  II←1
[ 5]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 6]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 7]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 8]  Y2H←Z+.×Y2⌹Z  ⍝ FIRST-STAGE FITTED VALUES
[ 9]  Y3H←Z+.×Y3⌹Z
[10]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ INITIAL 2SLS ESTIMATE
[11]  R←I←0
[12]  L1:R1←R
[13]  R←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×B1  ⍝ CORC ESTIMATE OF R
[14]  ATSBAD←ATSBAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[15]  R←(R,0.99)[1+R>1]
[16]  Y1R←(1↓Y1)-R×¯1↓Y1
[17]  Y2R←(1↓Y2)-R×¯1↓Y2
[18]  Y3R←(1↓Y3)-R×¯1↓Y3
[19]  ZR←(1 0 ↓Z)-R× ¯1 0 ↓Z
[20]  Y2HR←ZR+.×Y2R⌹ZR  ⍝ FITTED VALUES FROM THE TRANSFORMED REDUCED FORM
[21]  Y3HR←ZR+.×Y3R⌹ZR
[22]  X1R←(1 0 ↓X1)-R× ¯1 0 ↓X1
[23]  B1←Y1R⌹(Y2HR,Y3HR,X1R)
[24]  →L1×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[25]  ATSCOEF←ATSCOEF,[1] (B1,R,I)
[26]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[27]  S2←(+/UH*2)÷26
[28]  VAR←S2×⌹(⍉(Y2HR,Y3HR,X1R))+.×Y2HR,Y3HR,X1R
[29]  AVTSLS←AVTSLS,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[30]  →LL1×⍳(100>II←II+1)
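[31]  '***END'

5.  Theil G2SLS

\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

1. Compute

\[ \hat{Y}_1 = X\hat{\Pi}_1 = X(X'X)^{-1}X'Y_1 \]

Denote

\[ \hat{Z}_1 = (\hat{Y}_1 \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'y_1 \]

2. Compute the residuals and the autocorrelation coefficient (Prais-Winsten):

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \frac{\sum_{t=2}^{T}\hat{u}_{1t}\,\hat{u}_{1,t-1}}{\sum_{t=3}^{T}\hat{u}_{1,t-1}^{2}} \]

3. Transform the first equation with the Prais-Winsten transformation matrix $\hat{P}$. Writing a tilde for a variable premultiplied by $\hat{P}$,

\[ \tilde{y}_1 = \tilde{Y}_1 B_1 + \tilde{X}_1\gamma_1 + \varepsilon_1 \]

\[ \hat{\tilde{Y}}_1 = \tilde{X}\hat{\Pi}_1 = \tilde{X}(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{Y}_1 \]

Denote

\[ \hat{\tilde{Z}}_1 = (\hat{\tilde{Y}}_1 \;\; \tilde{X}_1) \]

4. Estimate

\[ \hat{\delta}_1 = (\hat{\tilde{Z}}_1'\hat{\tilde{Z}}_1)^{-1}\hat{\tilde{Z}}_1'\tilde{y}_1 \]

5. Iterate steps 2, 3, and 4 until convergence.

Calculation of the variance-covariance matrix

6. Compute

\[ \tilde{u}_1 = \tilde{y}_1 - (\tilde{Y}_1 \;\; \tilde{X}_1)\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \tilde{u}_1'\tilde{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{\tilde{Z}}_1'\hat{\tilde{Z}}_1)^{-1} \]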
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V 

THEIL1
[ 1]  C←30⍴1
[ 2]  ATCOEF1← 0 7 ⍴0
[ 3]  ATBAD1←0⍴0
[ 4]  AVTHEIL← 0 5 ⍴0
[ 5]  II←1
[ 6]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 7]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 8]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 9]  Y2H←Z+.×Y2⌹Z  ⍝ FIRST-STAGE FITTED VALUES
[10]  Y3H←Z+.×Y3⌹Z
[11]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ INITIAL 2SLS ESTIMATE
[12]  R1←I←0
[13]  L1:R11←R1
[14]  R1←((1↓E)+.×¯1↓E)÷+/(1↓¯1↓E←Y1-(Y2,Y3,X1)+.×B1)*2  ⍝ PRAIS-WINSTEN ESTIMATE OF R
[15]  ATBAD1←ATBAD1,II×⍳(R1>1)  ⍝ RECORD REPLICATIONS WITH R1 ABOVE ONE
[16]  R1←(R1,0.99)[1+R1>1]
[17]  RR1←(1-R1*2)*0.5  ⍝ SCALE FACTOR FOR THE FIRST OBSERVATION
[18]  ZR← 1 7 ⍴Z[1;]×RR1
[19]  ZR←ZR,[1] (1 0 ↓Z)-R1× ¯1 0 ↓Z
[20]  X1R← 1 3 ⍴X1[1;]×RR1
[21]  X1R←X1R,[1] (1 0 ↓X1)-R1× ¯1 0 ↓X1
[22]  Y1R←Y1[1]×RR1
[23]  Y1R←Y1R,(1↓Y1)-R1×¯1↓Y1
[24]  Y2R←Y2[1]×RR1
[25]  Y2R←Y2R,(1↓Y2)-R1×¯1↓Y2
[26]  Y3R←Y3[1]×RR1
[27]  Y3R←Y3R,(1↓Y3)-R1×¯1↓Y3
[28]  Y2HH←ZR+.×Y2R⌹ZR  ⍝ FITTED VALUES FROM THE TRANSFORMED REDUCED FORM
[29]  Y3HH←ZR+.×Y3R⌹ZR
[30]  B1←Y1R⌹(Y2HH,Y3HH,X1R)
[31]  →L1×⍳(20>I←I+1)∧(0.005<|R1-R11)  ⍝ AT MOST 20 PASSES
[32]  ATCOEF1←ATCOEF1,[1] (B1,R1,I)
[33]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[34]  S2←(+/UH*2)÷26
[35]  VAR←S2×⌹(⍉(Y2HH,Y3HH,X1R))+.×Y2HH,Y3HH,X1R
[36]  AVTHEIL←AVTHEIL,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[37]  →LL1×⍳(100>II←II+1)
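[38]  'END OF THE RUN'

6.  Theil: Instrumental Variable Approach

\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

1. Compute

\[ \hat{Y}_1 = X\hat{\Pi}_1 = X(X'X)^{-1}X'Y_1 \]

Denote

\[ \hat{Z}_1 = (\hat{Y}_1 \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{Z}_1'\hat{Z}_1)^{-1}\hat{Z}_1'y_1 \]

2. Compute the residuals and the autocorrelation coefficient (Prais-Winsten):

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \frac{\sum_{t=2}^{T}\hat{u}_{1t}\,\hat{u}_{1,t-1}}{\sum_{t=3}^{T}\hat{u}_{1,t-1}^{2}} \]

3. Transform the first structural equation using the Prais-Winsten transformation matrix:

\[ \hat{P}y_1 = \hat{P}Y_1 B_1 + \hat{P}X_1\gamma_1 + \hat{P}u_1 \]

or

\[ \tilde{y}_1 = \tilde{Y}_1 B_1 + \tilde{X}_1\gamma_1 + \varepsilon_1 \]

4. Denote

\[ \tilde{W}_1 = (\tilde{\hat{Y}}_1 \;\; \tilde{X}_1) \]

where

\[ \hat{Y}_1 = X(X'X)^{-1}X'Y_1, \qquad \tilde{\hat{Y}}_1 = \hat{P}\hat{Y}_1 \]

5. Estimate, with $\tilde{Z}_1 = (\tilde{Y}_1 \;\; \tilde{X}_1)$,

\[ \hat{\delta}_1 = (\tilde{W}_1'\tilde{Z}_1)^{-1}\tilde{W}_1'\tilde{y}_1 \]

6. Iterate steps 2, 3, 4, and 5 until convergence.

Calculation of the variance-covariance matrix

7. Compute

\[ \tilde{u}_1 = \tilde{y}_1 - \tilde{Z}_1\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \tilde{u}_1'\tilde{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\tilde{W}_1'\tilde{Z}_1)^{-1} \]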
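
The instrumental variable estimator of step 5 has the generic form $(W'Z)^{-1}W'y$. The Python sketch below shows that computation with plain lists; it is an illustration only, the helper names are hypothetical, and a small Gaussian elimination routine stands in for the matrix division primitive used in the APL listings.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def iv_estimate(w, z, y):
    """Return (W'Z)^{-1} W'y for instrument matrix w and regressor matrix z."""
    wt = transpose(w)
    wz = matmul(wt, z)
    wy = [sum(p * q for p, q in zip(row, y)) for row in wt]
    return solve(wz, wy)
```

When the instrument matrix equals the regressor matrix the same function returns the ordinary least squares estimate, which is a convenient check.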
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V  THEIL1IV 
[ 1]  ATV1← 0 7 ⍴0
[ 2]  AVTV1← 0 5 ⍴0
[ 3]  ATVBAD1←0⍴0
[ 4]  II←1
[ 5]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 6]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 7]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 8]  Y2H←Z+.×Y2⌹Z  ⍝ FIRST-STAGE FITTED VALUES
[ 9]  Y3H←Z+.×Y3⌹Z
[10]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ INITIAL 2SLS ESTIMATE
[11]  R←I←0
[12]  L1:R1←R
[13]  R←((1↓E)+.×¯1↓E)÷+/(1↓¯1↓E←Y1-(Y2,Y3,X1)+.×B1)*2  ⍝ PRAIS-WINSTEN ESTIMATE OF R
[14]  ATVBAD1←ATVBAD1,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[15]  R←(R,0.99)[1+R>1]
[16]  RR←(1-R*2)*0.5  ⍝ SCALE FACTOR FOR THE FIRST OBSERVATION
[17]  X1R← 1 3 ⍴X1[1;]×RR
[18]  X1R←X1R,[1] (1 0 ↓X1)-R× ¯1 0 ↓X1
[19]  Y1R←Y1[1]×RR
[20]  Y1R←Y1R,(1↓Y1)-R×¯1↓Y1
[21]  Y2R←Y2[1]×RR
[22]  Y2R←Y2R,(1↓Y2)-R×¯1↓Y2
[23]  Y3R←Y3[1]×RR
[24]  Y3R←Y3R,(1↓Y3)-R×¯1↓Y3
[25]  Y2HR←Y2H[1]×RR  ⍝ TRANSFORM THE FITTED VALUES THEMSELVES
[26]  Y2HR←Y2HR,(1↓Y2H)-R×¯1↓Y2H
[27]  Y3HR←Y3H[1]×RR
[28]  Y3HR←Y3HR,(1↓Y3H)-R×¯1↓Y3H
[29]  W←Y2HR,Y3HR,X1R
[30]  M←Y2R,Y3R,X1R
[31]  B1←(⌹((⍉W)+.×M))+.×((⍉W)+.×Y1R)  ⍝ INSTRUMENTAL VARIABLE ESTIMATE
[32]  →L1×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[33]  ATV1←ATV1,[1] (B1,R,I)
[34]  ⍝ ***END OF THE FIRST EQUATION
[35]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[36]  S2←(+/UH*2)÷26
[37]  VAR←S2×⌹(⍉W)+.×M
[38]  AVTV1←AVTV1,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[39]  →LL1×⍳(100>II←II+1)
[40]  '****END OF THE RUN'


V 
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7.  Amemiya  LI ML  (ALIML) 

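\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

1. Run 2SLS on the first structural equation and use the estimated coefficients to calculate the autocorrelation coefficient as

2. 

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \frac{\sum_{t=2}^{T}\hat{u}_{1t}\,\hat{u}_{1,t-1}}{\sum_{t=2}^{T}\hat{u}_{1,t-1}^{2}} \]

3. Write the structural equation under consideration as

\[ y_1^{*}B_1^{*} = X_1\gamma_1 + u_1 \]

where

\[ y_1^{*} = (y_1 \;\; Y_1) \]

\[ B_1^{*} = (1 \;\; -B_1')' \]

4. Denote

\[ \Psi_1 = \bar{y}_1^{*\prime}\bigl(I - \bar{X}_1(\bar{X}_1'\bar{X}_1)^{-1}\bar{X}_1'\bigr)\bar{y}_1^{*} \]

where the variables are transformed by the CORC matrix. Denote

\[ y_1^{*} = (y_1 \;\; Y_1) \]

\[ \Psi_2 = y_1^{*\prime}\bigl(I - \Phi(\Phi'\Phi)^{-1}\Phi'\bigr)y_1^{*} \]

where

\[ \Phi = (X \;\; Y_{-1} \;\; X_{-1}) \]

\[ X = (X_1 \;\; X_x) \]

\[ Y = (y_1^{*} \;\; Y_x) \]

where $X_x$ and $Y_x$ are the exogenous and endogenous variables excluded from the first equation.

5. Find $\hat{\lambda}$ as the minimum root of

\[ \det(\Psi_1 - \lambda\Psi_2) = 0 \]

6. Estimate $B_1^{*}$ as the solution to the following equation

\[ (\Psi_1 - \hat{\lambda}\Psi_2)\hat{B}_1^{*} = 0 \]

Use $\hat{B}_1^{*}$ to estimate $\gamma_1$ as

\[ \hat{\gamma}_1 = (\bar{X}_1'\bar{X}_1)^{-1}\bar{X}_1'\bar{y}_1^{*}\hat{B}_1^{*} \]

7. Iterate steps 2, 3, 4, 5, and 6 until convergence.

Calculation of the variance-covariance matrix of $\hat{\delta}_1$

8. Compute

\[ \hat{u}_1 = \bar{y}_1 - (\bar{Y}_1 \;\; \bar{X}_1)\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}\bigl[(\hat{\bar{Y}}_1 \;\; \bar{X}_1)'(\hat{\bar{Y}}_1 \;\; \bar{X}_1)\bigr]^{-1} \]

where $\hat{\bar{Y}}_1 = \Phi(\Phi'\Phi)^{-1}\Phi'\bar{Y}_1$.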
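
Steps 5 and 6 reduce to a small generalized eigenvalue problem. As an illustration only, the Python sketch below treats the simplest case of a single included endogenous regressor, so that both moment matrices are 2 x 2 and the determinantal equation is a quadratic; the function names are hypothetical, and the thesis model, which has two included endogenous regressors, would need a 3 x 3 version.

```python
import math


def min_root_2x2(a, b):
    """Smallest root of det(a - lam*b) = 0 for 2 x 2 matrices a and b.
    Expanding the determinant gives c2*lam^2 - c1*lam + c0 = 0."""
    c2 = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    c1 = (a[0][0] * b[1][1] + a[1][1] * b[0][0]
          - a[0][1] * b[1][0] - a[1][0] * b[0][1])
    c0 = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return min((c1 - root) / (2.0 * c2), (c1 + root) / (2.0 * c2))


def liml_coefficient_2x2(a, b):
    """Solve (a - lam*b)(1, -beta)' = 0 using the first row of the system."""
    lam = min_root_2x2(a, b)
    m11 = a[0][0] - lam * b[0][0]
    m12 = a[0][1] - lam * b[0][1]
    return lam, m11 / m12
```

The normalisation $(1, -\beta)'$ corresponds to $B_1^{*} = (1 \;\; -B_1')'$ in step 3, so the returned coefficient has the sign of $B_1$ itself.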
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V  ALIML 

[ 1]  AAC1← 0 7 ⍴0
[ 2]  AVAC1← 0 5 ⍴0
[ 3]  AABAD←0⍴0
[ 4]  II←1
[ 5]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 6]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 7]  Y3←YDATA[(30×(II-1))+⍳30;3]
[ 8]  Y2H←Z+.×Y2⌹Z
[ 9]  Y3H←Z+.×Y3⌹Z
[10]  B1←Y1⌹(Y2H,Y3H,X1)  ⍝ INITIAL 2SLS ESTIMATE
[11]  YL← ¯1 0 ↓⍉(3 30 ⍴(Y1,Y2,Y3))  ⍝ LAGGED ENDOGENOUS VARIABLES
[12]  YA← 1 0 ↓⍉(3 30 ⍴(Y1,Y2,Y3))
[13]  V←(1 0 ↓X11,X22,X33,X44,X55,X66),YL, ¯1 0 ↓XDATA,C
[14]  R←I←0
[15]  L1:R1←R
[16]  R←(1↓E)⌹¯1↓E←Y1-(Y2,Y3,X1)+.×B1  ⍝ CORC ESTIMATE OF R
[17]  AABAD←AABAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[18]  R←(R,0.99)[1+R>1]
[19]  X1R←(1 0 ↓X1)-R× ¯1 0 ↓X1
[20]  Y1R←(1↓Y1)-R×¯1↓Y1
[21]  Y2R←(1↓Y2)-R×¯1↓Y2
[22]  Y3R←(1↓Y3)-R×¯1↓Y3
[23]  YY←⍉(3 29 ⍴(Y1R,Y2R,Y3R))
[24]  W1←((⍉YY)+.×YY)-((⍉YY)+.×X1R)+.×YY⌹X1R
[25]  W2←((⍉YA)+.×YA)-((⍉YA)+.×V)+.×YA⌹V
[26]  L←1+÷MXEV Q←(⌹(W1-W2))+.×W2  ⍝ MINIMUM ROOT VIA THE LARGEST EIGENVALUE OF Q
[27]  WW←W1-L×W2
[28]  BN←(-WW[;1])⌹ 0 1 ↓WW
[29]  B1←(-BN),(⌹(⍉X1R)+.×X1R)+.×(⍉X1R)+.×YY+.×1,BN
[30]  →L1×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[31]  AAC1←AAC1,[1] (B1,R,I)
[32]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[33]  S2←(+/UH*2)÷26
[34]  Y2HR←V+.×Y2R⌹V
[35]  Y3HR←V+.×Y3R⌹V
[36]  VAR←S2×⌹(⍉(Y2HR,Y3HR,X1R))+.×Y2HR,Y3HR,X1R
[37]  AVAC1←AVAC1,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[38]  →LL1×⍳(100>II←II+1)
[39]  '***END'


V 


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8.  Generalized  LIML  (GLIML) 

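\[ y_1 = Y_1 B_1 + X_1\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

1. Write the structural equation under consideration as

\[ y_1^{*}B_1^{*} = X_1\gamma_1 + u_1 \]

where $y_1^{*}$ and $B_1^{*}$ are defined above.

2. For given $r$, transform the above structural equation, using the Prais-Winsten matrix, to one free of autocorrelation. Writing a tilde for a transformed variable,

\[ \tilde{y}_1^{*}B_1^{*} = \tilde{X}_1\gamma_1 + \varepsilon_1 \]

3. Denote

\[ \Psi_1^{*} = \tilde{y}_1^{*\prime}\bigl(I - \tilde{X}_1(\tilde{X}_1'\tilde{X}_1)^{-1}\tilde{X}_1'\bigr)\tilde{y}_1^{*} \]

\[ \Psi_2^{*} = \tilde{y}_1^{*\prime}\bigl(I - \tilde{X}(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\bigr)\tilde{y}_1^{*} \]

Calculate $\hat{\lambda}$ as the minimum root of

\[ \det(\Psi_1^{*} - \lambda\Psi_2^{*}) = 0 \]

4. Estimate $B_1^{*}$ and $\gamma_1$ as the solutions to the equations

\[ (\Psi_1^{*} - \hat{\lambda}\Psi_2^{*})\hat{B}_1^{*} = 0 \]

\[ \hat{\gamma}_1 = (\tilde{X}_1'\tilde{X}_1)^{-1}\tilde{X}_1'\tilde{y}_1^{*}\hat{B}_1^{*} \]

5. Compute

\[ \hat{u}_1 = \tilde{y}_1 - (\tilde{Y}_1 \;\; \tilde{X}_1)\hat{\delta}_1 \]

6. Calculate steps 1, 2, 3, 4, and 5 for $r$ varying between $(-1, +1)$ and choose the estimated coefficients which yield the minimum error sum of squares SE, where

\[ SE = \hat{u}_1'\hat{u}_1 \]

Calculation of the variance-covariance matrix of $\hat{\delta}_1$ is as follows.

7. Compute

\[ \hat{\sigma}_{11} = SE_{\min}/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}\bigl[(\hat{\tilde{Y}}_1 \;\; \tilde{X}_1)'(\hat{\tilde{Y}}_1 \;\; \tilde{X}_1)\bigr]^{-1} \]

where

\[ \hat{\tilde{Y}}_1 = \tilde{X}(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{Y}_1 \]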
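
The search over $r$ in step 6 can be illustrated as follows. The Python sketch is not taken from the thesis programs: the grid of 19 values from -0.9 to 0.9 is an assumption based on the APL listing, the names are hypothetical, and a one-regressor least-squares fit on quasi-differenced data stands in for the LIML step so that the example stays self-contained.

```python
def default_grid():
    """Assumed grid: -0.9, -0.8, ..., 0.9 (19 values)."""
    return [round(-0.9 + 0.1 * k, 1) for k in range(19)]


def quasi_difference(x, r):
    """x_t - r*x_{t-1} for t = 2..T."""
    return [x[t] - r * x[t - 1] for t in range(1, len(x))]


def sse_for_r(x, y, r):
    """Error sum of squares of a through-the-origin fit on transformed data
    (a stand-in for steps 1 to 5 at a given r)."""
    xs, ys = quasi_difference(x, r), quasi_difference(y, r)
    b = sum(p * q for p, q in zip(xs, ys)) / sum(p * p for p in xs)
    return sum((q - b * p) ** 2 for p, q in zip(xs, ys))


def grid_search_r(sse, grid=None):
    """Return the grid value of r with the smallest SSE, and that SSE."""
    if grid is None:
        grid = default_grid()
    best_r = min(grid, key=sse)
    return best_r, sse(best_r)
```

A call such as `grid_search_r(lambda r: sse_for_r(x, y, r))` then returns the chosen r together with the minimum error sum of squares that enters step 7.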
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V  GLIML 

[ 1]  ABC2← 0 6 ⍴0
[ 2]  BL2BAD←0⍴0
[ 3]  AVBC2← 0 5 ⍴0
[ 4]  LL←RF←ER2←⍳0
[ 5]  II←1
[ 6]  RT←(-1)+(⍳19)÷10  ⍝ GRID OF R VALUES: ¯0.9, ¯0.8, ..., 0.9
[ 7]  LL1:Y1←YDATA[(30×(II-1))+⍳30;1]  ⍝ REPLICATION II: 30 OBSERVATIONS
[ 8]  Y2←YDATA[(30×(II-1))+⍳30;2]
[ 9]  Y3←YDATA[(30×(II-1))+⍳30;3]
[10]  I←1
[11]  ER2←⍳0
[12]  L1:R←RT[I]
[13]  X1R←(1 0 ↓X1)-R× ¯1 0 ↓X1
[14]  Y1R←(1↓Y1)-R×¯1↓Y1
[15]  Y2R←(1↓Y2)-R×¯1↓Y2
[16]  Y3R←(1↓Y3)-R×¯1↓Y3
[17]  YY←⍉(3 29 ⍴(Y1R,Y2R,Y3R))
[18]  XT←(1 0 ↓Z)-R× ¯1 0 ↓Z
[19]  W1←((⍉YY)+.×YY)-((⍉YY)+.×X1R)+.×YY⌹X1R
[20]  W2←((⍉YY)+.×YY)-((⍉YY)+.×XT)+.×YY⌹XT
[21]  L←1+÷MXEV Q←(⌹(W1-W2))+.×W2  ⍝ MINIMUM ROOT VIA THE LARGEST EIGENVALUE OF Q
[22]  WW←W1-L×W2
[23]  BN←(-WW[;1])⌹ 0 1 ↓WW
[24]  B1←(-BN),(⌹(⍉X1R)+.×X1R)+.×(⍉X1R)+.×YY+.×1,BN
[25]  ER2←ER2,+/(ER←Y1R-(Y2R,Y3R,X1R)+.×B1)*2  ⍝ ERROR SUM OF SQUARES AT THIS R
[26]  →L1×⍳(19≥I←I+1)
[27]  R1←(ER2=⌊/ER2)/RT  ⍝ R WITH THE SMALLEST ERROR SUM OF SQUARES
[28]  ⍝ FINAL ESTIMATION OF THE FIRST EQUATION
[29]  X1R←(1 0 ↓X1)-R1× ¯1 0 ↓X1
[30]  Y1R←(1↓Y1)-R1×¯1↓Y1
[31]  Y2R←(1↓Y2)-R1×¯1↓Y2
[32]  Y3R←(1↓Y3)-R1×¯1↓Y3
[33]  YY←⍉(3 29 ⍴(Y1R,Y2R,Y3R))
[34]  XT←(1 0 ↓Z)-R1× ¯1 0 ↓Z
[35]  W1←((⍉YY)+.×YY)-((⍉YY)+.×X1R)+.×YY⌹X1R
[36]  W2←((⍉YY)+.×YY)-((⍉YY)+.×XT)+.×YY⌹XT
[37]  L←1+÷MXEV Q←(⌹(W1-W2))+.×W2
[38]  WW←W1-L×W2
[39]  BN←(-WW[;1])⌹ 0 1 ↓WW
[40]  B1←(-BN),(⌹(⍉X1R)+.×X1R)+.×(⍉X1R)+.×YY+.×1,BN
[41]  ABC2←ABC2,[1] (B1,R1)
[42]  UH←Y1R-(Y2R,Y3R,X1R)+.×B1
[43]  S2←(+/UH*2)÷26
[44]  Y2HR←XT+.×Y2R⌹XT
[45]  Y3HR←XT+.×Y3R⌹XT
[46]  VAR←S2×⌹(⍉(Y2HR,Y3HR,X1R))+.×Y2HR,Y3HR,X1R
[47]  AVBC2←AVBC2,[1] VAR[1;1],VAR[2;2],VAR[3;3],VAR[4;4],VAR[5;5]
[48]  →LL1×⍳(100>II←II+1)
[49]  '***END'
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B.  Systems  with  Lagged  Endogenous  Variable 
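1. Theil: Augmented Reduced Form

\[ YB + Y_{-1}\Gamma_1 + X\Gamma_2 = U \]

1. The first structural equation of the above system is

\[ y_1 = Y_1 B_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1 \]

2. Let

\[ W = (Y_{-1} \;\; Y_{-2} \;\; X \;\; X_{-1}) \]

Denote

\[ \hat{W}_1 = (\hat{Y}_1 \;\; \hat{y}_{1,-1} \;\; X_1) \]

\[ \hat{Y}_1 = W(W'W)^{-1}W'Y_1 \]

\[ \hat{y}_1 = W(W'W)^{-1}W'y_1 \]

\[ Z_1 = (Y_1 \;\; y_{1,-1} \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{W}_1'Z_1)^{-1}\hat{W}_1'y_1 \]

and $\hat{y}_{1,-1}$ is a one period lag of $\hat{y}_1$.

3. Compute

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \text{CORC formula} \]

4. Transform the first structural equation by the CORC method.

\[ \bar{y}_1 = \bar{Y}_1 B_1 + \bar{y}_{1,-1}\gamma_1 + \bar{X}_1\gamma_2 + \varepsilon_1 \]

5. Obtain $\hat{\bar{Y}}_1$ by applying OLS to the transformed augmented reduced form

\[ \hat{\bar{Y}}_1 = \bar{Z}_d\hat{\Pi} \]

where

\[ \bar{Z}_d = (\bar{Y}_{-1} \;\; \bar{Y}_{-2} \;\; \bar{X} \;\; \bar{X}_{-1}) \]

\[ \hat{\Pi} = (\bar{Z}_d'\bar{Z}_d)^{-1}\bar{Z}_d'\bar{Y}_1 \]

6. Denote

\[ \hat{\bar{Z}}_1 = (\hat{\bar{Y}}_1 \;\; \bar{y}_{1,-1} \;\; \bar{X}_1) \]

\[ \hat{\delta}_1 = (\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1}\hat{\bar{Z}}_1'\bar{y}_1 \]

7. Iterate steps 2, 3, 4, 5, and 6 until convergence.

8. Calculation of variance-covariance matrix is as before.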
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V  THEILA 

[ 1]  ATCA← 0 8 ⍴0
[ 2]  AVTA← 0 6 ⍴0
[ 3]  TABAD←0⍴0
[ 4]  II←1
[ 5]  L1:Y1←YDATA[(31×(II-1))+⍳31;1]  ⍝ REPLICATION II: 31 OBSERVATIONS
[ 6]  Y2←YDATA[(31×(II-1))+⍳31;2]
[ 7]  Y3←YDATA[(31×(II-1))+⍳31;3]
[ 8]  Y11←1↓Y1  ⍝ CURRENT VALUES
[ 9]  Y21←1↓Y2
[10]  Y31←1↓Y3
[11]  YL1←¯1↓Y1  ⍝ ONE PERIOD LAGS
[12]  YL2←¯1↓Y2
[13]  XI←YL1, 1 0 ↓X1
[14]  W←(¯1↓Y11),(¯1↓Y21),(¯1↓Y31),(¯1↓YL1),(¯1↓YL2),(2 0 ↓Z), 1 0 ↓ ¯1 0 ↓ 0 ¯1 ↓Z
[15]  Y1H←W+.×(1↓Y11)⌹W  ⍝ FITTED VALUES FROM THE AUGMENTED REDUCED FORM
[16]  Y2H←W+.×(1↓Y21)⌹W
[17]  Y3H←W+.×(1↓Y31)⌹W
[18]  WH←(1↓Y2H),(1↓Y3H),(¯1↓Y1H), 3 0 ↓X1
[19]  M←(2↓Y21),(2↓Y31),(2↓YL1), 3 0 ↓X1
[20]  B←(⌹((⍉WH)+.×M))+.×(⍉WH)+.×2↓Y11  ⍝ INITIAL INSTRUMENTAL VARIABLE ESTIMATE
[21]  R←I←0
[22]  L2:R1←R
[23]  R←(1↓E)⌹¯1↓E←Y11-(Y21,Y31,XI)+.×B  ⍝ CORC ESTIMATE OF R
[24]  TABAD←TABAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[25]  R←(R,0.99)[1+R>1]
[26]  Y1R←(1↓Y11)-R×¯1↓Y11
[27]  Y2R←(1↓Y21)-R×¯1↓Y21
[28]  Y3R←(1↓Y31)-R×¯1↓Y31
[29]  XIR←(1 0 ↓XI)-R× ¯1 0 ↓XI
[30]  WR←(1 0 ↓W)-R× ¯1 0 ↓W
[31]  Y2HR←WR+.×(1↓Y2R)⌹WR  ⍝ OLS ON THE TRANSFORMED AUGMENTED REDUCED FORM
[32]  Y3HR←WR+.×(1↓Y3R)⌹WR
[33]  B←(1↓Y1R)⌹Y2HR,Y3HR, 1 0 ↓XIR
[34]  →L2×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[35]  ATCA←ATCA,[1] (B,R,I)
[36]  URH←Y1R-(Y2R,Y3R,XIR)+.×B
[37]  S2←(+/URH*2)÷24
[38]  V←S2×⌹(⍉(Y2HR,Y3HR, 1 0 ↓XIR))+.×Y2HR,Y3HR, 1 0 ↓XIR
[39]  AVTA←AVTA,[1] V[1;1],V[2;2],V[3;3],V[4;4],V[5;5],V[6;6]
[40]  →L1×⍳(100>II←II+1)
[41]  '***END'
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2.  Theil:  Ordinary  Reduced  Form 

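\[ YB + Y_{-1}\Gamma_1 + X\Gamma_2 = U \]

The first structural equation is

\[ y_1 = Y_1 B_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1 \]

1. Let

\[ W = (Y_{-1} \;\; Y_{-2} \;\; X \;\; X_{-1}) \]

Denote

\[ \hat{W}_1 = (\hat{Y}_1 \;\; \hat{y}_{1,-1} \;\; X_1) \]

\[ Z_1 = (Y_1 \;\; y_{1,-1} \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{W}_1'Z_1)^{-1}\hat{W}_1'y_1 \]

where

\[ \hat{Y}_1 = W(W'W)^{-1}W'Y_1 \]

\[ \hat{y}_1 = W(W'W)^{-1}W'y_1 \]

2. Compute

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \text{Prais-Winsten formula} \]

3. Transform the first structural equation by the Prais-Winsten matrix $\hat{P}$:

\[ \hat{P}y_1 = \hat{P}Y_1 B_1 + \hat{P}y_{1,-1}\gamma_1 + \hat{P}X_1\gamma_2 + \varepsilon_1 \]

or

\[ \tilde{y}_1 = \tilde{Y}_1 B_1 + \tilde{y}_{1,-1}\gamma_1 + \tilde{X}_1\gamma_2 + \varepsilon_1 \]

4. Write the transformed ordinary reduced form as

\[ \hat{P}Y = \hat{P}Y_{-1}\Pi_0 + \hat{P}X\Pi_1 + V \]

To obtain fitted values of $\hat{P}Y$, we have to estimate the reduced form by an instrumental variable estimator, using an instrument for $Y_{-1}$ such as

\[ \hat{Y}_{-1} = X_{-1}\hat{\Pi} \]

where

\[ \hat{\Pi} = (X_{-1}'X_{-1})^{-1}X_{-1}'Y_{-1} \]

Denote

\[ W_d = (\hat{P}\hat{Y}_{-1} \;\; \hat{P}X) \]

\[ Z_d = (\hat{P}Y_{-1} \;\; \hat{P}X) \]

\[ \hat{\Pi}_1^{*} = (W_d'Z_d)^{-1}W_d'\tilde{Y}_1 \]

and obtain

\[ \widehat{\hat{P}Y_1} = \hat{\tilde{Y}}_1 = Z_d\hat{\Pi}_1^{*} \]

5. Denote

\[ \hat{\tilde{Z}}_1 = (\hat{\tilde{Y}}_1 \;\; \tilde{y}_{1,-1} \;\; \tilde{X}_1) \]

\[ \hat{\delta}_1 = (\hat{\tilde{Z}}_1'\hat{\tilde{Z}}_1)^{-1}\hat{\tilde{Z}}_1'\tilde{y}_1 \]

6. Iterate steps 2, 3, 4, and 5 until convergence.

7. Calculation of the variance-covariance matrix, with $\tilde{Z}_1 = (\tilde{Y}_1 \;\; \tilde{y}_{1,-1} \;\; \tilde{X}_1)$:

\[ \tilde{u}_1 = \tilde{y}_1 - \tilde{Z}_1\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \tilde{u}_1'\tilde{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{\tilde{Z}}_1'\hat{\tilde{Z}}_1)^{-1} \]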
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V  THEILO 

[ 1]  TC1← 0 8 ⍴0
[ 2]  VTC1← 0 6 ⍴0
[ 3]  TBAD←0⍴0
[ 4]  II←1
[ 5]  L1:Y1←YDATA[(31×(II-1))+⍳31;1]  ⍝ REPLICATION II: 31 OBSERVATIONS
[ 6]  Y2←YDATA[(31×(II-1))+⍳31;2]
[ 7]  Y3←YDATA[(31×(II-1))+⍳31;3]
[ 8]  Y11←1↓Y1
[ 9]  Y21←1↓Y2
[10]  Y31←1↓Y3
[11]  YL1←¯1↓Y1
[12]  YL2←¯1↓Y2
[13]  XI←YL1, 1 0 ↓X1
[14]  YL1H←(¯1 0 ↓Z)+.×YL1⌹(¯1 0 ↓Z)  ⍝ INSTRUMENTS FOR THE LAGGED ENDOGENOUS VARIABLES
[15]  YL2H←(¯1 0 ↓Z)+.×YL2⌹(¯1 0 ↓Z)
[16]  W←(¯1↓Y11),(¯1↓Y21),(¯1↓Y31),(¯1↓YL1),(¯1↓YL2),(2 0 ↓Z), 1 0 ↓ ¯1 0 ↓ 0 ¯1 ↓Z
[17]  Y1H←W+.×(1↓Y11)⌹W
[18]  Y2H←W+.×(1↓Y21)⌹W
[19]  Y3H←W+.×(1↓Y31)⌹W
[20]  WH←(1↓Y2H),(1↓Y3H),(¯1↓Y1H), 3 0 ↓X1
[21]  M←(2↓Y21),(2↓Y31),(2↓YL1), 3 0 ↓X1
[22]  Z1← 1 0 ↓Z
[23]  B←(⌹((⍉WH)+.×M))+.×(⍉WH)+.×2↓Y11  ⍝ INITIAL INSTRUMENTAL VARIABLE ESTIMATE
[24]  R←I←0
[25]  L2:R1←R
[26]  R←((1↓E)+.×¯1↓E)÷+/(1↓¯1↓E←Y11-(Y21,Y31,XI)+.×B)*2  ⍝ PRAIS-WINSTEN ESTIMATE OF R
[27]  TBAD←TBAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[28]  R←(R,0.99)[1+R>1]
[29]  RR←(1-R*2)*0.5
[30]  ⍝ ***TRANSFORMATION OF THE VARIABLES
[31]  Y1R←(Y11[1]×RR),(1↓Y11)-R×¯1↓Y11
[32]  Y2R←(Y21[1]×RR),(1↓Y21)-R×¯1↓Y21
[33]  Y3R←(Y31[1]×RR),(1↓Y31)-R×¯1↓Y31
[34]  ZR← 1 7 ⍴Z1[1;]×RR
[35]  ZR←ZR,[1] (1 0 ↓Z1)-R× ¯1 0 ↓Z1
[36]  XIR← 1 4 ⍴XI[1;]×RR
[37]  XIR←XIR,[1] (1 0 ↓XI)-R× ¯1 0 ↓XI
[38]  YL1R←(YL1[1]×RR),(1↓YL1)-R×¯1↓YL1
[39]  YL2R←(YL2[1]×RR),(1↓YL2)-R×¯1↓YL2
[40]  YL1HR←(YL1H[1]×RR),(1↓YL1H)-R×¯1↓YL1H
[41]  YL2HR←(YL2H[1]×RR),(1↓YL2H)-R×¯1↓YL2H
[42]  WR←(YL1HR,YL2HR,ZR)
[43]  MR←(YL1R,YL2R,ZR)
[44]  Y2RH←MR+.×(⌹((⍉WR)+.×MR))+.×(⍉WR)+.×Y2R  ⍝ IV FIT OF THE TRANSFORMED REDUCED FORM
[45]  Y3RH←MR+.×(⌹((⍉WR)+.×MR))+.×(⍉WR)+.×Y3R
[46]  B←Y1R⌹(Y2RH,Y3RH,XIR)
[47]  →L2×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[48]  TC1←TC1,[1] (B,R,I)
[49]  URH←Y1R-(Y2R,Y3R,XIR)+.×B
[50]  S2←(+/URH*2)÷24
[51]  VA←S2×⌹(⍉(Y2RH,Y3RH,XIR))+.×Y2RH,Y3RH,XIR
[52]  VTC1←VTC1,[1] VA[1;1],VA[2;2],VA[3;3],VA[4;4],VA[5;5],VA[6;6]
[53]  →L1×⍳(100>II←II+1)
[54]  '***END'

V 
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3.  Fair  Estimator 

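\[ y_1 = Y_1 B_1 + X_1^{*}\gamma_1 + u_1 = Z_1\delta_1 + u_1 \]

where

\[ X_1^{*} = (y_{1,-1} \;\; X_1) \]

1. Let

\[ W = (X_1^{*} \;\; X_{1,-1}^{*} \;\; y_{1,-1} \;\; Y_{1,-1}) \]

Note that some of the lagged endogenous variables are included in $X_{1,-1}^{*}$, $y_{1,-1}$ and $Y_{1,-1}$. However, it is clear that in the matrix $W$ we will count them only once. Denote

\[ \hat{W}_1 = (\hat{Y}_1 \;\; \hat{y}_{1,-1} \;\; X_1) \]

\[ Z_1 = (Y_1 \;\; y_{1,-1} \;\; X_1) \]

\[ \hat{\delta}_1 = (\hat{W}_1'Z_1)^{-1}\hat{W}_1'y_1 \]

\[ \hat{Y}_1 = W(W'W)^{-1}W'Y_1 \]

\[ \hat{y}_1 = W(W'W)^{-1}W'y_1 \]

2. Compute

\[ \hat{u}_1 = y_1 - Z_1\hat{\delta}_1 \]

\[ \hat{r} = \text{CORC formula} \]

3. Transform the structural equation:

\[ \bar{y}_1 = y_1 - \hat{r}\,y_{1,-1} \]

\[ \bar{X}_1^{*} = X_1^{*} - \hat{r}\,X_{1,-1}^{*} \]

\[ \bar{\hat{Y}}_1 = \hat{Y}_1 - \hat{r}\,Y_{1,-1} \]

Denote

\[ \hat{\bar{Z}}_1 = (\bar{\hat{Y}}_1 \;\; \bar{X}_1^{*}) \]

4. Estimate

\[ \hat{\delta}_1 = (\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1}\hat{\bar{Z}}_1'\bar{y}_1 \]

5. Iterate steps 2, 3, and 4 until convergence.

Calculation of the variance-covariance matrix, with $\bar{Z}_1 = (Y_1 - \hat{r}\,Y_{1,-1} \;\; \bar{X}_1^{*})$:

\[ \hat{u}_1 = \bar{y}_1 - \bar{Z}_1\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1} \]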
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V  FAIR 

[ 1]  AFC← 0 8 ⍴0
[ 2]  AVFC← 0 6 ⍴0
[ 3]  FBAD←0⍴0
[ 4]  II←1
[ 5]  L1:Y1←YDATA[(31×(II-1))+⍳31;1]  ⍝ REPLICATION II: 31 OBSERVATIONS
[ 6]  Y2←YDATA[(31×(II-1))+⍳31;2]
[ 7]  Y3←YDATA[(31×(II-1))+⍳31;3]
[ 8]  Y11←1↓Y1
[ 9]  Y21←1↓Y2
[10]  Y31←1↓Y3
[11]  YL←¯1↓Y1
[12]  XI←YL, 1 0 ↓X1
[13]  V← ¯1 0 ↓Y11,Y21,Y31,XI  ⍝ INSTRUMENT SET: LAGGED VARIABLES
[14]  V←V, 2 0 ↓X11,X44
[15]  Y1H←V+.×(1↓Y11)⌹V
[16]  Y2H←V+.×(1↓Y21)⌹V
[17]  Y3H←V+.×(1↓Y31)⌹V
[18]  W←(1↓Y2H),(1↓Y3H),(¯1↓Y1H), 3 0 ↓X1
[19]  M←(2↓Y21),(2↓Y31),(2↓YL), 3 0 ↓X1
[20]  B←(⌹((⍉W)+.×M))+.×(⍉W)+.×2↓Y11  ⍝ INITIAL INSTRUMENTAL VARIABLE ESTIMATE
[21]  R←I←0
[22]  L2:R1←R
[23]  R←(1↓E)⌹¯1↓E←Y11-(Y21,Y31,XI)+.×B  ⍝ CORC ESTIMATE OF R
[24]  FBAD←FBAD,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[25]  R←(R,0.99)[1+R>1]
[26]  XIR←(1 0 ↓XI)-R× ¯1 0 ↓XI
[27]  Y1R←(1↓Y11)-R×¯1↓Y11
[28]  Y2R←(1↓Y21)-R×¯1↓Y21
[29]  Y3R←(1↓Y31)-R×¯1↓Y31
[30]  Y2HR←Y2H-R×¯1↓Y21  ⍝ FITTED VALUE LESS R TIMES THE ACTUAL LAGGED VALUE
[31]  Y3HR←Y3H-R×¯1↓Y31
[32]  B←Y1R⌹(Y2HR,Y3HR,XIR)
[33]  →L2×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[34]  AFC←AFC,[1] (B,R,I)
[35]  URH←Y1R-(Y2R,Y3R,XIR)+.×B
[36]  S2←(+/URH*2)÷24
[37]  VA←S2×⌹(⍉(Y2HR,Y3HR,XIR))+.×Y2HR,Y3HR,XIR
[38]  AVFC←AVFC,[1] VA[1;1],VA[2;2],VA[3;3],VA[4;4],VA[5;5],VA[6;6]
[39]  →L1×⍳(100>II←II+1)
[40]  '***END'
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4.  Dhrymes  Estimator  ( C2SLSA ) 

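\[ y_1 = Y_1 B_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1 \]

1. Estimate the structural equation under consideration by an instrumental variable estimator, as in the Theil (ordinary reduced form) method, and calculate a consistent estimate of the autocorrelation coefficient $r$.

2. Obtain fitted values of $Y_1$ by application of OLS to the augmented reduced form

\[ \hat{Y}_1 = Y_{-1}\hat{\Pi}_1 + Y_{-2}\hat{\Pi}_2 + X\hat{\Pi}_3 + X_{-1}\hat{\Pi}_4 \]

3. Transform the first equation as

\[ y_1 - \hat{r}\,y_{1,-1} = \bigl[(\hat{Y}_1 - \hat{r}\,Y_{1,-1}) \;\; (y_{1,-1} - \hat{r}\,y_{1,-2}) \;\; (X_1 - \hat{r}\,X_{1,-1})\bigr]\delta_1 + \varepsilon_1 \]

4. Denote

\[ \hat{\bar{Z}}_1 = \bigl[(\hat{Y}_1 - \hat{r}\,Y_{1,-1}) \;\; (y_{1,-1} - \hat{r}\,y_{1,-2}) \;\; (X_1 - \hat{r}\,X_{1,-1})\bigr] \]

\[ \hat{\delta}_1 = (\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1}\hat{\bar{Z}}_1'\bar{y}_1, \qquad \bar{y}_1 = y_1 - \hat{r}\,y_{1,-1} \]

5. Use $\hat{\delta}_1$ to obtain a new estimate of $r$ and iterate steps 3 and 4 until convergence.

Calculation of the variance-covariance matrix, with $\bar{Z}_1$ formed from the transformed actual variables:

\[ \hat{u}_1 = \bar{y}_1 - \bar{Z}_1\hat{\delta}_1 \]

\[ \hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K) \]

\[ \mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{\bar{Z}}_1'\hat{\bar{Z}}_1)^{-1} \]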
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V  DHRYMIT 

[ 1]  AD2S← 0 8 ⍴0
[ 2]  AVD2S← 0 6 ⍴0
[ 3]  DB2S←0⍴0
[ 4]  II←1
[ 5]  L1:Y1←YDATA[(31×(II-1))+⍳31;1]  ⍝ REPLICATION II: 31 OBSERVATIONS
[ 6]  Y2←YDATA[(31×(II-1))+⍳31;2]
[ 7]  Y3←YDATA[(31×(II-1))+⍳31;3]
[ 8]  Y11←1↓Y1
[ 9]  Y21←1↓Y2
[10]  Y31←1↓Y3
[11]  YL1←¯1↓Y1
[12]  YL2←¯1↓Y2
[13]  XI←YL1, 1 0 ↓X1
[14]  W←(¯1↓Y11),(¯1↓Y21),(¯1↓Y31),(¯1↓YL1),(¯1↓YL2),(2 0 ↓Z), 1 0 ↓ ¯1 0 ↓ 0 ¯1 ↓Z
[15]  Y1H←W+.×(1↓Y11)⌹W  ⍝ FITTED VALUES FROM THE AUGMENTED REDUCED FORM
[16]  Y2H←W+.×(1↓Y21)⌹W
[17]  Y3H←W+.×(1↓Y31)⌹W
[18]  WW←(1↓Y2H),(1↓Y3H),(¯1↓Y1H), 3 0 ↓X1
[19]  ZR←(2↓Y21),(2↓Y31),(2↓YL1), 3 0 ↓X1
[20]  B←(⌹((⍉WW)+.×ZR))+.×(⍉WW)+.×2↓Y11  ⍝ INITIAL INSTRUMENTAL VARIABLE ESTIMATE
[21]  R←I←0
[22]  L2:R1←R
[23]  R←(1↓E)⌹¯1↓E←Y11-(Y21,Y31,XI)+.×B  ⍝ CORC ESTIMATE OF R
[24]  DB2S←DB2S,II×⍳(R>1)  ⍝ RECORD REPLICATIONS WITH R ABOVE ONE
[25]  R←(R,0.99)[1+R>1]
[26]  ⍝ ****TRANSFORMATION OF THE VARIABLES
[27]  XIR←(1 0 ↓XI)-R× ¯1 0 ↓XI
[28]  Y1R←(1↓Y11)-R×¯1↓Y11
[29]  Y2R←(1↓Y21)-R×¯1↓Y21
[30]  Y3R←(1↓Y31)-R×¯1↓Y31
[31]  Y2HR←Y2H-R×¯1↓Y21
[32]  Y3HR←Y3H-R×¯1↓Y31
[33]  B←Y1R⌹(Y2HR,Y3HR,XIR)
[34]  →L2×⍳(20>I←I+1)∧(0.005<|R-R1)  ⍝ AT MOST 20 PASSES
[35]  AD2S←AD2S,[1] (B,R,I)
[36]  URH←Y1R-(Y2R,Y3R,XIR)+.×B
[37]  S2←(+/URH*2)÷24
[38]  V←S2×⌹(⍉(Y2HR,Y3HR,XIR))+.×Y2HR,Y3HR,XIR
[39]  AVD2S←AVD2S,[1] V[1;1],V[2;2],V[3;3],V[4;4],V[5;5],V[6;6]
[40]  →L1×⍳(100>II←II+1)
[41]  '***END'
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5.  Dhrymes  Two  Step  Instrumental  Variable  Estimator 

$Y B + Y_{-1}\Gamma_1 + X\Gamma_2 = U$

The first equation can be written as

$y_1 = Y_1\beta_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1$

The ordinary reduced form of the above system is

$Y = -Y_{-1}\Gamma_1 B^{-1} - X\Gamma_2 B^{-1} + U B^{-1} = Y_{-1}\Pi_1 + X\Pi_2 + V$

1. Estimate all structural equations by IV and use the
estimated coefficients to form a consistent estimate of $\Pi_1$
and $\Pi_2$. Then obtain the fitted values of $Y$ using $Y_{-1}(0) = 0$
as the initial condition.

$\hat{Y} = \hat{Y}_{-1}\hat{\Pi}_1 + X\hat{\Pi}_2$

2. Use the initial IV estimates of the structural equation
under consideration to calculate $\hat{r}$.

$\hat{u}_1 = y_1 - Z_1\hat{\delta}_1$

$\hat{r}$ = Prais-Winsten formula

3. Define

$\hat{W}_1 = (P\hat{Y}_1,\; P\hat{y}_{1,-1},\; PX_1)$

$Z_1 = (PY_1,\; Py_{1,-1},\; PX_1)$

and estimate

$\hat{\delta}_1 = (\hat{W}_1' Z_1)^{-1}\hat{W}_1' y_1$

Calculation of variance-covariance matrix

4. $\hat{u}_1 = y_1 - Z_1\hat{\delta}_1$

$\hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K)$

$\mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(\hat{W}_1' Z_1)^{-1}$
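The second-round steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not a transcription of the APL routine that follows: the function names are hypothetical, the Prais-Winsten estimate of r is assumed to be the ratio of the lag-one cross product of the residuals to the sum of squares of the interior residuals, and the dependent variable is assumed to be transformed by P along with the regressors and instruments.

```python
import numpy as np

def prais_winsten_r(u):
    # Assumed form: sum_{t=2}^{T} u_t u_{t-1} / sum_{t=2}^{T-1} u_t^2
    u = np.asarray(u, dtype=float)
    return float(u[1:] @ u[:-1] / np.sum(u[1:-1] ** 2))

def prais_winsten_transform(a, r):
    # P: scale the first row by sqrt(1 - r^2), quasi-difference the rest,
    # so all T observations are retained
    a = np.asarray(a, dtype=float)
    out = np.empty_like(a)
    out[0] = np.sqrt(1.0 - r ** 2) * a[0]
    out[1:] = a[1:] - r * a[:-1]
    return out

def iv_estimate(W, Z, y):
    # Instrumental variable estimator (W'Z)^{-1} W'y
    return np.linalg.solve(W.T @ Z, W.T @ y)

def dhrymes_second_step(y, Z, W_hat, delta_iv):
    # y: (T,) dependent variable, Z: (T, K) regressors,
    # W_hat: (T, K) instruments built from fitted values,
    # delta_iv: (K,) first-round IV estimates
    r = prais_winsten_r(y - Z @ delta_iv)        # step 2
    Wp = prais_winsten_transform(W_hat, r)       # step 3
    Zp = prais_winsten_transform(Z, r)
    yp = prais_winsten_transform(y, r)           # assumption: y transformed too
    delta = iv_estimate(Wp, Zp, yp)
    u = yp - Zp @ delta                          # step 4
    T, K = Zp.shape
    cov = (u @ u) / (T - K) * np.linalg.inv(Wp.T @ Zp)
    return delta, r, cov
```

The sketch requires the estimated r to be below one in absolute value, since the first-row scaling is otherwise undefined.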


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V  DHRYM2S 
[  1  ]  DC  1+  0  8  p  0 
[  2  ]  DC 2  +  0  7  p  0 
[  3]  VDC2  +  0  6  pO 
[  4]  VDCI +  0  6  pO 
[  5]  YHl+YH2^YH2*-22pO 
[  6]  DBAD+ OpO 
[  7]  11+ 1 

[  8]  L1:Y1+YDATA[(21*(II-1) )+i31;l] 

[  9]  Y  2+Y  DAT  A[ (21*(II-1) )  +  \  3  1 ;  2  ] 

[10]  Y2+YDATAl(21x(II-l))  +  x21‘,2'] 

[11]  Y11<-14Y1 

[12]  Y21^14Y2 

[13]  Y31«-14Y3 

[14]  YL1^'14Y1 

[15]  YL2+~1±Y2 

[16]  XI+YL 1,  1  0  4X1 

[17]  XI 1  +  1  0  411 

[18]  W+  (  ~  1  4  Y  1 1  )  ,  (  ”  1  4  Y  2  1  )  ,  (  ~  1  4  Y  3  1  )  ,  (  ~  1  4  Y  L 1  )  ,  (  ”  1  4  Y  L  2  )  ,  (2  0  4Z 

),  1  0  4  “ 1  0  4  0  ”1  4Z 

[19]  Y1H+W+ . x (14Y11  )@J/ 

[20]  Y2H+W+ . x ( 14Y21 

[21]  Y2H+W+  .  x  (  1+Y31  )@J/ 

[22]  Wl+(liY2H)  ,  ( 1  4  Y  3  B )  ,  (~14Y1B)  ,  3  0  4X1 

[23]  W2+(1±Y1H)  ,  C14Y2B)  ,  3  0  4X2 

[24]  Z1<-(24Y21  )  ,  (24Y31  )  ,  (24YZ1  )  ,  3  0  4X1 

[25]  Z2«-(24Y11)  ,  (24YI2)  ,  3  0  4X2 

[26]  B  1<-(E(  (WD  +  .xZl)  )  +  .  x  (  1)  +  .x24Y11 

[27  ]  B  2«-(@(  ( <X>V2  )  +  .xZ2))  +  .x  (W2  )  +  .  x2  4  Y21 

[28]  B2  +  (  14Y31  )@(Y2tf,  2  0  4X3  ) 

[29]  BB+  2  2  pl,(-Bl[l]),(-Bl[2]),(-B2[l]),l,0,0,(-B3[l]),l 

[30]  Q1  +  1  7  pBl[4],0,0,Bl[5],0,0,Bl[6] 

[31]  Q 2+  1  7  p  0  ,  B  2  [  3  ]  ,  0  ,  B  2  [  4  ] ,  0  ,  B  2  [  5  ] ,  B  2  [  6  ] 

[32]  Q2  +  1  7  pO,B3[2] , B  3 [ 3 ] ,  0  ,  B  3  [  4  ] ,  0  ,  B  3  [  5  ] 

[33]  Q+Q 1 , [1]  Q 2, [1]  Q 2 

[34]  P+  2  2  pBl[3],0,0,0,B2[2],0,0,0,0 

[35]  PP1+ (EEjBB )  +  .  xp 

[36]  PP2  + ( ®BB )  +  . x  # 

[37]  1+2 

[38]  YBl^YB2^YB3^32pO 

[39]  L  2  :  YB1  [I]«-PP1  [  1  ;  ]  +  .  x  (  YH1  [  I  -  1  ]  ,  YH2LI-1  ]  ,  YB3  [I-  1  ]  )  +PP2 [1 ; 

] + . x$z [I- 1 ; ] 

[40 ]  YB2 [ I ] +PP1 [ 2 ; ]+ . x(YHl LI- 1 ] , YB2 [ I -  1 ] , YB3  Ll~l ] )+PP2 [2 ; ]  + 

.  x$z [I- 1 ; ] 

[41 ]  YB3 [I]<-PPi [3 ;  ]  +  . x ( YH1 LI-1 ] ,YB2{ J-l ] , YB3 [1-1 ] ) +PP 2 [3 ; ]+ 
.x<$>z[J-l  ;  ] 

[42]  +L 2  x \ (32  >1+1 + 1 ) 

[43]  /?<-(  (  14B)  +  tx’l4f)^+/  (  14_14£«-Y11-  (Y21,Y31,XI)  +  .xS1)*2 

[44]  RR+ ( 1  - B* 2 ) *  0 . 5 

[45]  Y  HL  +  1 4  ~ 1 4  YB1 

[46]  YB1^24YB1 

[47]  YH2+2\YH2 

[48]  YH2  +  2 4  YB3 

[49]  YHLR+L YHLLl ]*RR) , ( 1 4 YHL ) - R* ' 1 4 Y HL 
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[50]  YH2R+(YH2lllxRR) , ( 1 4 YH2 ) - R* " 1 4 Ytf2 

[51]  YH3R+(YH3  [1  ]xflfl)  ,  (  UY  HZ)  -  R*~  ltY  HZ 

[52]  YlR*-(Ylllll\xRR)  ,  ( 1  +  y  1 1 )  -fix-14-Yll 

[53]  Y  2R+-  (  Y  2  1  [  1  ]  xRR )  ,  (  1  4  Y  2  1  )  -7?x“14Y21 

[54]  Y  3R^  (Y31[l]xi?/?)  ,  (14Y31)--Rx“14Y31 

[55]  XI R<-  1  3  pill  [1  ;  ]*RR 

[56]  XIR^XIR,  [  1  ]  (  1  0  4Y71)-/?x  ~l  0  4111 

[57]  XIR +-  1  4  pY7[l;]x/?fl 

[58]  XIR+XIR, [ 1 ] ( 1  0  4 XI)-Rx  "1  0  4X7 

[59]  YI17?«-(YIl[l]x/?J?)  ,  (14YL1  ) -/?x-14YL1 

[60]  WW+YH2R, YH3R, YHLR, XIR 

[61]  ZR*-Y2R,YZR,YL1R,X1R 

[62]  B«-(B(  (WW)  +  .*ZR)  )  +  .  x  (W)  +  .  xYll 

[63]  DC2+DC2 , [1 ]  B , R 

[64]  URH^Y 1R- (Y2R,Y3R,XIR)+.*B 

[65]  S2^(  +  /URH*2  )  *24 

[66]  VA+S2xE($(YH2R,YH3R, YHLR,X1R) )  +  . *Y 2R ,Y 3R ,Y L1R , XIR 

[67]  VDC2+VDC2 , [ 1 ]  VA [ 1 ; 1 ] , VA [ 2 ; 2 ] ,  VA  [ 3 ; 3 ] , VA [ 4 ; 4 ] , VA [ 5 ; 5 ] , 

m6;6] 

[68]  7«-l 

[69]  L3-.R1+R 

[  70  ]  R<-(  ( 14J?)  +  .  x"l  4£)+  +  /  ( 14"  14£«-Y11-  (Y21,Y31,X7)+.x£)*2 

[71]  DBAD+DBAD, II*\ (R>1 ) 

[72]  R<-(R,  0  .  99)  [1+/?>1] 

[73]  RR+-  (l-/?*2)*0.5 

[74]  YHLR*-(YHLlH\*RR) , ( 1 4 Y HL ) - R* " 1 4 Y #7 

[75]  YH2R+-  (Y  H2ll]xRR)  ,  ( 1  4  Ytf2  )  -  J?x  ‘  1  4  YH2 

[76]  YH3R+-  (  Y  j¥  3  [  1  ]  *RR  )  ,  (14Ytf3  )  -Rx~l±YH2 

[77]  Y  17?<-  (  Y 1 1  [  1  ]  xRR )  ,  (14Y11)-7?x"14Y11 

[78]  Y  2  7?-<-  (Y21[l]x7?/?)  ,  (14Y21)-7?x“14Y21 

[79]  Y3  7?*-(Y31[1]x7?7?),(14Y31)-7?x-14Y31 

[80]  Yl/?«-  1  3  pX  I  111  \  1*RR 

[81]  X 1  R^X  1 7? ,  [  1  ]  (  1  0  4X71  )  -  Rx  "1  0  4X71 

[82]  XIR+  1  4  p I 7 [ 1 ; ] *RR 

[83]  X7/?«-X7/?,  [1  ]  (1  0  4 X7)-7?x  ~i  o  4  XI 

[84]  YL1R+IYL1  lll^RR)  ,  (14YL1  )  -/?x"i+YLi 

[85]  WW^YH2R, YH3R, YHLR, XIR 

[86]  ZR*-Y2R,Y3R, YL1R, XIR 

[87]  (WW)  +  .  *ZR)  )  +  .  x  (WW)  +  .  xyii 

[88  ]  -^7  3  x  i  (  2  0  >7^-7+ 1  )a(o.005<|B-B1) 

[89]  DCI+DCI , [ 1 ]  B,RtI 

[90]  URH+Y IB- (Y2R,Y3R,XIR)+.*B 

[91]  S2<-(  +/URH*2  )*24 

[92]  V+-S2  xg(  <$>  (Y  H2R  ,Y  H3R  ,Y  HLR  yXlR)  )  +  .  *Y 2R ,Y 3 R ,Y L1R , X 1R 

[93]  VDCI+VDCI,  [1  ]  V[1;1],F[2;2],7[3;3],F[4;4],F[5;5],V'[6;6 

] 

[94]  ->Llx  i  (  lOO>77^-77+l  ) 

[95]  ’ ***END ’ 
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6.  Hatanaka(A)  Estimator 

$y_1 = Y_1\beta_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1$

1. Apply OLS to the augmented reduced form and obtain the
fitted values of $Y_1$ and $y_{1,-1}$.

Denote

$\hat{W}_1 = (\hat{Y}_1,\; \hat{y}_{1,-1},\; X_1)$

$Z_1 = (Y_1,\; y_{1,-1},\; X_1)$

2. Apply IV to the structural equation under consideration
and obtain fitted values of $u_1$ and thus $\hat{r}$.

$\hat{\delta}_{IV} = (\hat{W}_1' Z_1)^{-1}\hat{W}_1' y_1$

3. $\hat{u}_1 = y_1 - Z_1\hat{\delta}_{IV}$

$\hat{r}$ = CORC formula

4. Denote

$\tilde{y}_1 = y_1 - \hat{r}\,y_{1,-1}$

and finally

Calculation of the variance-covariance

6. $\hat{u}_1 = y_1 - Z_1\hat{\delta}_1$

$\hat{\sigma}_{11} = \hat{u}_1'\hat{u}_1/(T-K)$
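As a hedged illustration of the "CORC formula" step and the quasi-differencing in step 4, the following Python sketch assumes that the Cochrane-Orcutt estimate of r is the slope from regressing the residual on its own lag without an intercept. The function names `corc_r` and `quasi_difference` are hypothetical.

```python
import numpy as np

def corc_r(u):
    # Assumed CORC form: regress u_t on u_{t-1} (no intercept)
    u = np.asarray(u, dtype=float)
    return float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))

def quasi_difference(a, r):
    # a_t - r * a_{t-1} for t = 2, ..., T; the first observation is dropped,
    # so the result has T - 1 rows
    a = np.asarray(a, dtype=float)
    return a[1:] - r * a[:-1]

# Small worked example with made-up residuals
u_hat = np.array([1.0, 2.0, 3.0, 4.0])
r_hat = corc_r(u_hat)                      # (2 + 6 + 12) / (1 + 4 + 9)
y_tilde = quasi_difference([1.0, 2.0, 3.0], 0.5)
```

Unlike the Prais-Winsten transformation, this transformation leaves T-1 usable observations.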
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V  HATANAKAA 
[  1]  AHA  +  0  7  pO 
[  2]  AVHA  +  0  7  pO 
[  3]  11+ 1 

[  4]  LI :Yl+YDATAl (31*(II-1) )+i31 ;1] 

[  5]  Y  2+Y  DAT  A[ ( 3 1  * ( II -  1 ) )  +  i  3  1  ;  2  ] 

[  6]  Y3+YDATAL ( 3 1 x ( JJ-1 ) )  +  i31 ; 3] 

[  7]  Yll«-liYl 
[  8]  Y  21  +  1 4-Y  2 
[  9]  Y  3  1  ^  1  i  Y  3 

[10]  YLl+'liYl 

[11]  YL2+~1AY2 

[12]  XI+YL 1,  1  0  ill 

[13]  V+  (“14-Y11)  ,  ( “ 1 IY  2 1  )  ,  (  “  1  i  Y  3  1 )  ,  (“14'YIl)  ,  (  "  1 i YL2 )  ,  (2  0  iZ 
),104-'1040‘liZ 

[14]  Y  1 H+W  +  .  x  ( 1 4-  Y 1 1  )  \£W 

[15]  Y2H+W+  .  x  ( 1  +  Y21  )Sl/ 

[16]  Y  3  H+V  +  .  x  (  1 4.  y  3  i  ) 

[17]  W1  +  (1±Y2H)  ,  (HY3H)  ,  CliYltf)  ,  3  0  4-X1 

[18]  M+  (  2  i  Y  2  1 )  ,  (24-Y31)  ,  (  2  4  Y  L 1  )  ,  3  0  ill 

[19]  B+(  @(  (Wl  )+.xM)  )+.x(Wl)  +  .x2  +  Yll 

[20]  UH+Y 11- ( Y  2 1 ,  Y  3  1 ,XI)  +  ,*B 

[21]  R+(1UJHM~U'UH 

[22]  Y  2HR+Y 2H - R*~ 1 ±Y  21 

[23]  Y3HR+Y3H-R*- 14Y31 

[24]  XIR+(  1  0  4YI)-7?x  "1  o  iXI 

[25]  Y 1 R+  (l4'Yll)-i?x~14-Yll 

[26]  Y  2  7?*-  (14-Y21)-7?x“liY21 

[27]  Y37?^(  14Y31  ) -7?x“14Y31 

[28]  WH+Y2HR,  Y3HR,XIR,  ~HUH 

[29]  BR+Y1R^WH 

[30]  AHA+AHA , [1 ]  BR+ ( 0 , 0 , 0 , 0 , 0 , 0 , R ) 

[31]  URH+Y1R- (Y2R, Y3R,XIR) + . *~1+BR 

[32]  S2+(+ /URH*2 )*2  4 

[33]  V+S2*iB(WH)+  ,*WH 

[34]  AVHA+AVHA , [1]  F[1;1],F[2;2],7[3;3],Y[4;4],7[5;5],F[6;6 
]  ,  V  [  7  ;  7  ] 

[35]  +Llx i ( 100>I7<-II+1 ) 

[36]  ’ ***END ’ 


V 
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7.  Hatanaka(B)  Estimator 

$y_1 = Y_1\beta_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1 = Z_1\delta_1 + u_1$

The augmented reduced form of the system is

$Y = Y_{-1}\Pi_1 + Y_{-2}\Pi_2 + X\Pi_3 + X_{-1}\Pi_4 + V$

1. Estimate all the structural equations by IV, and use the
estimated coefficients to form a consistent estimate of $\Pi_1$,
$\Pi_2$, $\Pi_3$, and $\Pi_4$. Then calculate the fitted values of $Y$ as

$\hat{Y} = Y_{-1}\hat{\Pi}_1 + Y_{-2}\hat{\Pi}_2 + X\hat{\Pi}_3 + X_{-1}\hat{\Pi}_4$

where the $\hat{\Pi}_j$'s are formed using the consistent estimates of the
structural coefficients obtained in the first step.

2. From the first-stage IV estimation, calculate the fitted
values of $u_1$. Then obtain an estimate of the autocorrelation
coefficient $r$.

$\hat{u}_1 = y_1 - Z_1\hat{\delta}$

$\hat{r}$ = CORC formula


3.  Denote 


Then  estimate  the  structural  coefficients  as 


Finally, the proposed estimates are


A 


A 


r  i  \rn+  r  / 

The variance-covariance matrix is

$\mathrm{Var}(\hat{\delta}_1) = \hat{\sigma}_{11}(W_1' Z_1)^{-1}$
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8.  Hatanaka(C)  Estimator 

$y_1 = Y_1\beta_1 + y_{1,-1}\gamma_1 + X_1\gamma_2 + u_1$

1. Obtain $W_1$ as in Hatanaka(B).

2. Also, from the first-stage IV estimation obtain

$\tilde{u}_1 = \hat{u}_1 - \hat{r}\,\hat{u}_{1,-1}$

3. Estimate the structural coefficients as


Then the final estimated coefficients are


where $\hat{\delta}_{IV}$ is obtained from the first-stage IV estimation.

4.  The  variance-covariance  matrix  of  the  structural 
coefficient  is 


A  A 


1 


V  HATANAKABC 
[  1]  AHB+  0  7  pO 
[  2]  AHC+  0  7  pO 
[  3]  AVHB  +  0  7  pO 
[  4]  AVHC+  0  7  pO 
[  5]  11+ 1 

[  6]  II  : Y 1+Y  DAT  A[  (  3  1  x  {II -  1  )  )  +  i  3  1  ;  1] 

[  7]  Y2+YDATA[(31x(II-l) )+i31;2] 

[  8]  Y3+YDATAI (31x(jj-i) ) + i 3 1 ; 3 ] 

[  9]  Yll+liYl 

[10]  Y21+-14-Y2 

[11]  Y31^1lY3 

[12]  YL1-*-- 1 1 Y1 

[13]  Y  L2  +  ~ 1 1 Y2 

[14]  XI+YL 1,  1  0  AX1 

[15]  {/<-(  "14-Y11  )  ,  C11Y21)  ,  C11Y31)  ,  CllYH)  ,  ('14-YL2)  ,  (2  0  1Z 
),10l‘10l0"llZ 

[16]  Y1H+W+  .  x  ( l  +  Yll  )@J/ 

[17]  Y2H+W+ . x ( i*Y21 )©/ 

[18]  Y3H+W+  .  x  ( 1+13  1  )ESJ/ 

[19]  W1+(1AY2H)  ,  ( 1 4-Y  35)  ,  (  _14-Y1^)  ,  3  0  111 

[20]  W2+(1±Y1H) , 1+Y2H) ,  3  0  1Z2 

[21]  Z1«-(24'Y21)  ,  (24-Y31)  ,  ( 2 1 YL 1  )  ,  3  0  111 

[22]  Z2«-(2lYll)  ,  (24-YI2)  ,  3  0  112 

[23]  B1«-(ES(  (WD  +  .xZl)  )  +  .x(Wl)  +  .x2lYll 

[24  ]  B2«-(E(  (W2  )  +  .  xZ2  )  )  +  .  x  (W2  )  +  .  x2lY21 

[25]  B3+(  11Y31  )EE)(  Y2B ,  2  0  1Y3  ) 

[26]  BB  +  3  3  pi,  (  -  51  [  1  ]  )  ,  (  -  B1 [ 2 ] )  ,  (  -B2 [ 1 ]  )  ,1,0,0,  ( -53 [ 1 ] )  ,1 

[27]  Q1  +  1  7  p  £ 1 [ 4  ]  ,  0  ,  0  ,  S 1 [ 5  ]  ,  0  ,  0 , 5 1 [ 6 ] 

[28]  Q2  +  1  7  pO  ,£2[3]  ,  0  ,B2[4]  ,  0  ,S2[5] ,S2[6] 

[29]  Q3  +  1  7  p  0  ,  B  3  [  2  ]  ,  5  3 [ 3 ] , 0 , 5  3 [ 4 ] , 0 , S  3 [ 5 ] 

[30]  Q+Ql  ,  [1]  Q2  ,  [  1  ]  Q  3 

[31]  P+  3  3  p£l[3],0,0,0,£2[2],0,0,0,0 

[32]  R1+(11EM~  ItE+Yll  -  (Y21  ,Y31  ,  YI1  ,  1  0  Ul)  +  .x51 

[33]  R2  +  ( 1  i  E  )\B~  1  A  E+Y  2  1  -  (Yll  ,  YI2  ,  1  0  lY2)+.xfi2 

[34]  R3+( 11B)@"11B^Y31- (Y21 ,  1  0  lY3)+.x£3 

[35]  RR+  3  3  pBl, 0,0, 0,52, 0,0, 0,53 

[36]  Pl^(gB£)+.x(P+55+.x£B) 

[37]  P2+{W\BB)  +  .  x55+  .  xf 

[38]  P3*-(@££)  +  .  x£ 

[39]  P4^(@££ ) + . x55+ . xQ 

[40]  YY+§ ( 3  31  p  Y 1  ,  Y  2  ,  Y  3  ) 

[41]  Y  2  H+  (Pl[2;]+.x<$)(l  0  1  “1  0  1YY  )  )  -  (52  [2  ;  ]+  .  x$(  "2  0  1YY) 

)  +  (P3[2;]  +  .x<X>(2  0  +  Z))-(P4[2;]+.x$>(i  0  1  "1  0  1Z)) 

[42]  Y  3  H+  (P1[3;]  +  .X<$>(1  0  l  _1  0  4-YY)  )  -  ( P  2  [  3  ;  ]  +  .x<$)("2  0  1YY) 
) + (P3 [ 3 ; ] + . xfi?( 2  0  1Z ) ) - (P4 [ 3 ; ] + . x$( l  0  1  “1  0  lZ)) 

[43]  UH1+Y11- (Y21 ,Y31 ,XI)+ .x£i 

[44]  Y  2HR+Y  25-51x~14-Y21 

[45]  Y  3HR+Y 3P-51x"llY31 

[46]  Y 1 R+  (11Y11)  -  51x"14-Y11 

[47]  Y2R+{  14-Y21  )  -51x"liY21 

[48]  Y  3  R+  (11Y31)-51x“14-Y31 

[49]  XIR+(1  0  1 YI ) -  51 x  “I  0  AXI 

[50]  WH+Y2HR,Y3HR,XIR,~ 1+UH1 


■ 
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[51]  Z  R+-Y  2R ,Y  3  R , X I R ,  ~  HU  HI 

[52]  ( fyWH )  +  .x-ZR)  )  +  .  x  (  §WH )  +  .  x  y  l  R 

[53]  AHB+-AHB  ,[1]  BR+  (  0  ,  0  ,  0  ,  0  ,  0  ,  0  ,  Rl  ) 

[54]  URff+YIR-  (Y2R,Y3R,XIR)  +  .*'  HBR 

[55]  S2+( +/URH*2 )*24 

[56]  V+-S2x\3(WH)  +  .  xZR 

[57]  AVHB<-AVHB  ,  [  1  ]  7  [  1  ;  1  ]  ,  7  [  2  ;  2  ]  ,  7  [  3  ;  3  ]  ,  7  [  4  ;  4  ]  ,  7  [  5  ;  5  ]  ,  7  [  6  ;  6 
]  ,  7  [  7  ;  7  ] 

[58]  n  ****CALCULATION  OF  H  AT  AN AKA  C 

[59]  URHl^(liUHl)  -Rl*~  1UJH1 

[60]  BC+URHlEWH 

[61]  AHC+-AHC,  [  1  ]  BC+(BltRl) 

[62]  URH+Y1R- (Y2R,Y3R,XIR)+.x~liBC+{Bl,Rl) 

[63]  S2+-(  +  /URH*2)+2Ll 

[64]  V+S2x®(<WH)  +  .  xWH 

[65]  AVHC+AVHC, [1 ]  7[1;1],7[2;2],7[3;3],7[4;4],7[5;5],7[6;6 
]  ,  7  [  7  ;  7  ] 

[66]  ^Ilxi (100>JJ^IJ+1 ) 

[67]  '***END' 
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APPENDIX  III:  THE  RELATIVE  IMPORTANCE  OF  THE  FIRST 


OBSERVATION  IN  SIMULTANEOUS  EQUATION  MODELS 

Review  of  the  Findings  in  the  Single  Equation  Case 
Taylor (1981) gave analytical reasons for the
differences that exist in the performance of a single
equation estimator in the presence of autocorrelation when
different specifications for the explanatory variables are
used. In general, he showed that the specification of the
process that generates the explanatory variable affects the
performance of different estimators. As was the case in Rao
and Griliches (1969), Maeshiro (1976, 1979), and Park and
Mitchell (1980), he considered the following model

$y_t = a_1 + b_1 X_t + u_t, \quad t = 1, \ldots, T$  (1)

$u_t = r u_{t-1} + \epsilon_t$

Following Maeshiro, Rao and Griliches, and Park and Mitchell,
the X variable followed one of the following processes:

1- Stationary autoregressive process of the form

$X_t = \lambda X_{t-1} + V_t$  (2)

where

$\mathrm{Var}(X_t) = \mathrm{plim}\,(1/T)\sum X_t^2 = \sigma_x^2 = \sigma^2/(1-\lambda^2)$

2- Non-stochastic autoregressive process of the form:

$X_t = \lambda X_{t-1}, \quad t = 2, \ldots, T$

$X_1 = c_1$

where $\mathrm{Var}(X_t) = \mathrm{plim}\,(1/T)\sum X_t^2 = \sigma_x^2 = 0$

He showed that the ratio of the variance of the coefficient
b when it is estimated by the Cochrane-Orcutt method ($b_c$),
which omits the first observation, to its variance under GLS ($b_s$),
which retains the first observation, is equal to

$\frac{V(b_c)}{V(b_s)} = \frac{(1-2\lambda r+r^2)(T-1)\sigma_x^2 + (1-r^2)\sigma_x^2 + c_1^2(1-r\lambda)^2/(1-\lambda^2)}{(1-2\lambda r+r^2)(T-1)\sigma_x^2 + c_1^2(r-\lambda)^2/(1-\lambda^2)}$

which goes to one as T approaches infinity if $\sigma_x^2$ is
nonzero. However, if $\sigma_x^2$ approaches zero, we shall have

$\frac{V(b_c)}{V(b_s)} = \frac{(1-r\lambda)^2}{(r-\lambda)^2} = 1 + \frac{(1-r^2)(1-\lambda^2)}{(r-\lambda)^2}$  (3)

which tends to infinity as r approaches $\lambda$.

He also showed that if $\sigma_x^2 = 0$, then

$V(b_s) = \sigma^2/[(1-r^2)X_1^2 + (1-r/\lambda)^2 \sum X_t^2]$  (4)

which suggests that as r tends toward $\lambda$, the relative
importance of the first observation increases and when $r = \lambda$,
the only relevant observation is the first one.
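The behaviour of the variance ratio can be checked numerically. The Python sketch below evaluates the ratio of the Cochrane-Orcutt variance to the GLS variance as read from the text, together with the limiting expression in formula (3). The function names and the parameter values in the usage lines are illustrative assumptions.

```python
def variance_ratio(r, lam, T, sigma_x2, c1):
    # V(b_c) / V(b_s) for autocorrelation r, trend coefficient lam,
    # sample size T, regressor variance sigma_x2 and starting value c1
    common = (1 - 2 * lam * r + r ** 2) * (T - 1) * sigma_x2
    num = common + (1 - r ** 2) * sigma_x2 + c1 ** 2 * (1 - r * lam) ** 2 / (1 - lam ** 2)
    den = common + c1 ** 2 * (r - lam) ** 2 / (1 - lam ** 2)
    return num / den

def limit_ratio(r, lam):
    # Formula (3): the ratio when sigma_x2 = 0
    return 1 + (1 - r ** 2) * (1 - lam ** 2) / (r - lam) ** 2

# Stochastic X (sigma_x2 > 0): the ratio approaches one as T grows
ratio_small_T = variance_ratio(0.5, 0.3, 20, 1.0, 1.0)
ratio_large_T = variance_ratio(0.5, 0.3, 10 ** 6, 1.0, 1.0)

# Non-stochastic X (sigma_x2 = 0): the ratio grows without bound as r nears lam
ratio_far = limit_ratio(0.5, 0.3)
ratio_near = limit_ratio(0.5, 0.49)
```

With a zero regressor variance the first function reduces exactly to the second, which is the algebraic identity behind formula (3).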

Transforming model (1) so that it is free of
autocorrelation and then estimating the resultant model with
OLS gives:

$\sqrt{1-r^2}\,y_1 = b_1\sqrt{1-r^2}\,X_1 + \epsilon_1$

$y_t - r y_{t-1} = b_1(X_t - rX_{t-1}) + \epsilon_t, \quad t = 2, \ldots, T$

if all observations are utilized;
or

$\tilde{y}_t = b_1\tilde{X}_t + \epsilon_t$

This transformation does not necessarily decrease the
variability of the X matrix. Since

$V(X_t - rX_{t-1})/V(X_t) = 1 - 2r\lambda + r^2$  (5)

the above transformation increases the variance of the
exogenous variable whenever (r>0, r>2$\lambda$) or (r<0, r<2$\lambda$)
and thus increases the precision of the associated estimator
of $b_1$. If X is non-stochastic ($\sigma_x^2 = 0$) and r becomes equal to
$\lambda$, we shall have

$\sqrt{1-r^2}\,y_1 = b_1\sqrt{1-r^2}\,X_1 + \epsilon_1$
$y_t - r y_{t-1} = b_1(X_t - rX_{t-1}) + \epsilon_t$
or

$\sqrt{1-r^2}\,y_1 = b_1\sqrt{1-r^2}\,X_1 + \epsilon_1$

$y_t - r y_{t-1} = 0 + \epsilon_t$

which means that as a result of the above transformation, if
X is non-stochastic and trended ($\sigma_x^2 = 0$ or $V_t = 0$), the only
relevant observation is the first one, and the
Cochrane-Orcutt estimator, which omits the first observation,
loses all relevant information about the data; this is
the reason why its variance tends to infinity (see formula
(3)). However, this would not have happened if the X process were
stochastic, i.e., $X_t = V_t$, t = 2, ..., T.
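The argument can be illustrated with a short Python sketch. It applies the all-observation transformation to a non-stochastic trended regressor with r equal to the trend coefficient, and it checks the sign condition attached to formula (5). The function names and the starting value c1 = 1 are assumptions made for the illustration.

```python
import numpy as np

def transform_all_observations(x, r):
    # First observation scaled by sqrt(1 - r^2), the rest quasi-differenced
    x = np.asarray(x, dtype=float)
    first = np.sqrt(1.0 - r ** 2) * x[0]
    return np.concatenate(([first], x[1:] - r * x[:-1]))

def variance_ratio_5(r, lam):
    # Formula (5): V(X_t - r X_{t-1}) / V(X_t)
    return 1.0 - 2.0 * r * lam + r ** 2

def increases_variance(r, lam):
    # Condition stated in the text: (r>0, r>2*lam) or (r<0, r<2*lam)
    return (r > 0 and r > 2 * lam) or (r < 0 and r < 2 * lam)

lam, c1, T = 0.6, 1.0, 20            # c1 = 1 is an illustrative choice
x = c1 * lam ** np.arange(T)         # X_t = lam * X_{t-1}, X_1 = c1
x_star = transform_all_observations(x, r=lam)
# Every transformed value after the first is zero (up to rounding), so an
# estimator that drops the first observation has no variation in X left.
```

Only the first transformed observation carries information about the slope in this case.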

Therefore,  Taylor  concluded  that  (1981,  p.78): 

Thus  Park  and  Mitchell's  (and  Maeshiro's)  results 
unambiguously  apply  to  non-stochastic  trended 
variables  and  to  stochastic  processes  whose 
realization  happen  to  be  trended.  On  the  other  hand, 

Rao  and  Griliches  results  pertain  to  a  stochastic 
exogenous  process.... 


Simultaneous Equation Case

The following points may help to explain why the
presence of trend in the explanatory variables affects the
performance of the simultaneous equation estimators
differently from that of the single equation methods; in
other words, why the estimators that employ all T
observations do not necessarily perform better than those
which utilize T-1 observations when some of the explanatory
variables are trended:

1.  Unlike  the  single  equation  models,  simultaneous  equation 
models  include  some  endogenous  variables  among  their 
explanatory  variables.  This  means  that  even  if  all  the 
purely  exogenous  variables  are  trended  and 
non-stochastic, the explanatory variables nevertheless
contain some stochastic elements.

2.  Many  of  the  limited  information  methods  do  not  use 
transformed  X's  as  instruments  in  their  first  step. 
Therefore,  since  they  do  not  transform  the  X's,  they 
will  not  be  subject  to  any  possible  reduction  in  the 
variability  of  the  exogenous  variables  caused  by  the 
autoregressive  transformation.  Moreover,  some  of  the 
limited  information  estimators  (like  Fair)  use  the 
lagged  value  of  the  endogenous  variables  as  instruments 
in  the  first  step.  Thus  even  if  the  X's  are 
non-stochastic  and  trended,  the  Y's  are  not.  Hence,  the 
instruments  in  the  first  step  estimation  will  inevitably 
contain  stochastic  variables. 

3. If some of the X's are non-stochastic and trended and it
happens that their trend coefficients are close to the
autocorrelation coefficient of the structural equation
under consideration, the value of the first observation
of the trended X's decreases as the number of X's
that are not trended, or whose trend coefficients
are not equal to r, increases.
This means that even those estimators (i.e., the Theil type)
that rely on the transformed Z's will perform more or
less the same whether they use T or T-1 observations if
the number of X's is large. This explains the small
difference that exists between Theil and M2SLS when the
X's are trended. This can also explain the observed
improvement in the performance of the Fair estimator,
which uses non-transformed X's and Y's as instruments,
relative to the Theil estimator when X3 and X6 were
trended.

4. It has to be noted that all the experiments conducted in
the single equation context have considered only one
explanatory variable. It would seem reasonable to expect
that as the number of explanatory variables increases
(especially if some of them are not trended or their
trend coefficients are not close to the autocorrelation
coefficient of that equation) the value of the first
observation of the trended variable decreases. Hence,
the gap between the Cochrane-Orcutt and GLS estimators
decreases.

Therefore,  the  conclusion  reached  by  Taylor  that  "the 
case  in  which  the  first  observation  is  asymptotically 
important  requires  an  unusual  X  process;  for  typical 
economic  variables,  the  asymmetry  disappears  in  expectation" 
seems  to  be  valid  for  the  simultaneous  equation  system  as 
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well.  However,  in  empirical  work,  it  is  advisable  to  use 
estimators  which  employ  all  T  observations  as  our 
experiments  show. 

To  test  the  empirical  validity  of  the  proposition 
outlined  above  concerning  the  effect  of  having  more  than  one 
explanatory  variable  in  the  model  on  the  performance  of 
single  equation  estimators  we  conducted  the  following 
sampling  experiments: 

Experiment  A:  The  first  experiment  was  conducted  using 
a  model  similar  to  that  of  Maeshiro  and  Park  and  Mitchell, 
with  the  following  stochastic  characteristics: 

$y_t = a_1 + b_1 x_{1t} + u_t$

$u_t = r u_{t-1} + \epsilon_t$

$x_{1t} = \lambda x_{1,t-1}$

$\epsilon_t \sim IIN(0, 0.025)$
$r = \lambda = 0.6$
$a_1 = b_1 = 1$

100 samples of 20 observations were created. Then the
performance of OLS, CORC, and GLS was examined. Table A3-1
shows the relative RMSE of these estimators. The results
clearly support the findings of Maeshiro and Park and
Mitchell, suggesting the extreme inefficiency of CORC
relative to OLS and GLS.
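A sampling experiment of this kind can be sketched in Python as follows. This is not the thesis's own program, and several choices are assumptions made for the illustration: the starting value of the trended regressor is set to one, 0.025 is treated as the variance of the disturbance, the first disturbance is drawn from the stationary distribution, r is estimated once from the OLS residuals (no iteration) and clipped to the interval [-0.99, 0.99], and GLS means the feasible Prais-Winsten estimator using that same estimate.

```python
import numpy as np

def simulate_sample(rng, T=20, r=0.6, lam=0.6, a1=1.0, b1=1.0,
                    var_eps=0.025, c1=1.0):
    x = c1 * lam ** np.arange(T)                 # x_t = lam * x_{t-1}
    eps = rng.normal(0.0, np.sqrt(var_eps), size=T)
    u = np.empty(T)
    u[0] = eps[0] / np.sqrt(1.0 - r ** 2)        # stationary start (assumed)
    for t in range(1, T):
        u[t] = r * u[t - 1] + eps[t]
    return x, a1 + b1 * x + u

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def slope_estimates(x, y):
    X = np.column_stack([np.ones_like(x), x])
    b_ols = ols(X, y)
    e = y - X @ b_ols
    r_hat = float(np.clip((e[1:] @ e[:-1]) / (e[:-1] @ e[:-1]), -0.99, 0.99))
    # CORC: quasi-difference and drop the first observation
    Xc, yc = X[1:] - r_hat * X[:-1], y[1:] - r_hat * y[:-1]
    # GLS: keep the first observation, scaled by sqrt(1 - r_hat^2)
    w = np.sqrt(1.0 - r_hat ** 2)
    Xg = np.vstack([w * X[:1], Xc])
    yg = np.concatenate([[w * y[0]], yc])
    return {"OLS": b_ols[1], "CORC": ols(Xc, yc)[1], "GLS": ols(Xg, yg)[1]}

def run_experiment(n_samples=100, seed=0, b1=1.0):
    rng = np.random.default_rng(seed)
    draws = [slope_estimates(*simulate_sample(rng, b1=b1))
             for _ in range(n_samples)]
    return {k: float(np.sqrt(np.mean([(d[k] - b1) ** 2 for d in draws])))
            for k in ("OLS", "CORC", "GLS")}

rmse = run_experiment()
```

Dividing each entry of `rmse` by the GLS entry would give relative RMSE figures comparable in spirit to those reported in Table A3-1, although the numbers depend on the assumptions listed above.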

Experiment B: The second experiment was conducted
employing the same model as above except adding two
regressors to give

$y_t = a_1 + b_1 X_{1t} + b_2 X_{2t} + b_3 X_{3t} + u_t$

where

$X_{2t} = \lambda_2 X_{2,t-1} + V_{2t}$

$X_{3t} = \lambda_3 X_{3,t-1} + V_{3t}$

$V_{2t} \sim IIN(0, 0.06)$

$V_{3t} \sim IIN(0, 0.09)$

$\lambda_2 = 0.2, \quad \lambda_3 = 0.3$

The stochastic regressors were added on the basis of the
argument that it is unrealistic to assume all the regressors
of an equation are non-stochastic.

Table A3-2 shows the relative RMSE of OLS, CORC, and
GLS. OLS and GLS, which employ all T observations, are still
performing better on the estimation of the trended variable
$X_1$, but the extreme asymmetry due to the first observation
has disappeared. CORC is no longer inferior to OLS. This
result shows that the relative asymmetry of the first
observation of the non-stochastic trended variable decreases
as the number of stochastic regressors increases. Our result
supports the findings of Rao and Griliches that correcting for
autocorrelation results in a gain of efficiency, especially
when r>0.3. This result also clarifies our findings in the
simultaneous equation system that there does not exist a
dramatic difference between the limited information
estimators which use T and those that employ T-1
observations when the exogenous variables are trended or
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APPENDIX  IV:  MONTE  CARLO  RESULTS  CONCERNING  THE 


AUTOREGRESSIVE  MODELS  WITHOUT  LAGGED  ENDOGENOUS  VARIABLES 

This appendix presents the detailed statistics on the results
of the Monte Carlo experiments using the static model
presented in the first part of this thesis. The structural
equation under consideration can be written as

$y_1 = 32 + 0.64\,y_2 + 0.22\,y_3 + 0.65\,X_1 + 0.52\,X_4 + u_1$

In the following tables we have used normalized (RMSE),
normalized trace(MSE), and normalized det(MSE) indices. To
define these indices, we first specify the following indices

1- The root mean square error (RMSE) of the jth coefficient
is

$RMSE(j) = [\sum_i (\hat{\beta}_{ij} - \beta_j)^2 / N]^{1/2}$

2- The mean square error matrix is

$MSE = [\sum_i (\hat{\beta}_i - \beta)(\hat{\beta}_i - \beta)'] / N$

3- The trace(MSE) is

$\mathrm{Trace}(MSE) = \sum \mathrm{diag}(MSE)$

Then the normalized indices are defined as follows

$\text{Normalized}(RMSE_s) = RMSE_s / RMSE(\text{Theil True})$

$\text{Normalized tr}(MSE_s) = \mathrm{tr}(MSE_s) / \mathrm{tr}[MSE(\text{Theil True})]$

$\text{Normalized det}(MSE_s) = \det(MSE_s) / \det[MSE(\text{Theil True})]$
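These indices are straightforward to compute from a matrix of Monte Carlo estimates. The Python sketch below assumes the estimates are stored with one replication per row and one coefficient per column. The function names, the tiny data set and the benchmark array (standing in for the "Theil True" estimator) are made up for the illustration.

```python
import numpy as np

def rmse_per_coefficient(estimates, beta_true):
    # RMSE(j) = sqrt( sum_i (b_ij - beta_j)^2 / N )
    d = np.asarray(estimates, dtype=float) - np.asarray(beta_true, dtype=float)
    return np.sqrt(np.mean(d ** 2, axis=0))

def mse_matrix(estimates, beta_true):
    # MSE = sum_i (b_i - beta)(b_i - beta)' / N
    d = np.asarray(estimates, dtype=float) - np.asarray(beta_true, dtype=float)
    return d.T @ d / d.shape[0]

def normalized_indices(estimates, benchmark, beta_true):
    # Each index for estimator s is divided by the same index for the benchmark
    mse_s = mse_matrix(estimates, beta_true)
    mse_b = mse_matrix(benchmark, beta_true)
    return {
        "rmse": rmse_per_coefficient(estimates, beta_true)
                / rmse_per_coefficient(benchmark, beta_true),
        "trace": np.trace(mse_s) / np.trace(mse_b),
        "det": np.linalg.det(mse_s) / np.linalg.det(mse_b),
    }

# Made-up example: four replications of two coefficients
beta = np.array([1.0, 2.0])
dev = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
est_s = beta + dev            # estimator under study
est_b = beta + 2.0 * dev      # benchmark with errors twice as large
idx = normalized_indices(est_s, est_b, beta)
```

Values below one indicate that the estimator under study outperforms the benchmark on that index.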
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TABLE  A4-1:  Estimate  of  Bias  For  the  Autoregressive  Static  Model 

(r = 0.2, T = 30)
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In  addition  the  aggregate  bias  of  the  2SLS  estimator  is  on  the  basis 
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Note  that  the  aggregate  bias  index  does  not  include  the  intercept  term, 
in  addition  the  aggregate  bias  of  the  2SLS  estimator  is  on  the  basis 
of  the  four  estimated  structural  coefficients'  biases. 
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of  the  four  estimated  structural  coefficients'  biases. 
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Note  that  the  aggregate  bias  index  does  not  include  the  intercept  term. 
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TABLE  A4-7:  Normalized  Trace(RMSE)  Using  Autoregressive  Static  Model 

(r = 0.2, T = 30)
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TABLE  A4-8 :  Normalized  Trace(RMSE)  Using  Autoregressive  Static  Model 

(r = 0.6, T = 30)
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TABLE  A4-9:  Normalized  Trace(RMSE)  Using  Autoregressive  Static  Model 

(r = 0.9, T = 30)
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TABLE  A4-10:  Normalized  Trace(RMSE)  Using  Autor egress i ve  Static  Model 

(r = 0.2, T = 60)
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TABLE  A 4  —  1 1 :  Normalized  Trace(RMSE)  Using  Autoregressive  Static  Model 

(r = 0.6, T = 60)
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TABLE  A4-12:  Normalized  Trace(RMSE)  Using  Autoregressive  Static  Model 

(r = 0.9, T = 60)
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APPENDIX  V:  MONTE  CARLO  RESULTS  OF  THE  DYNAMIC 

AUTOREGRESSIVE  MODELS 

This appendix presents detailed statistics of the results of
the Monte Carlo experiments using the dynamic autoregressive
model outlined in chapter six.

The structural equation under consideration can be
presented as

$y_1 = 32 + 0.64\,y_2 + 0.22\,y_3 + 0.25\,y_{1,-1} + 0.65\,X_1 + 0.52\,X_4 + u_1$
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TABLE  A5-1:  Estimated  Bias  on  the  Basis  of  the  Mean  Using  Dynamic  Autoregressive  Model 

(r = 0.2, T = 30)
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TABLE  A5-2:  Estimated  Bias  on  the  Basis  of  the  Mean  Using  Dynamic  Autoregress  1 ve  Model 

(r = 0.6, T = 30)
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TABLE  A5-3:  Estimated  Bias  on  the  Basis  of  the  Mean  Using  Dynamic  Autoregressive  Model 

(r = 0.9, T = 30)
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TABLE  A5-4:  Estimated  Bias  on  the  Basis  of  the  Mean  Using  Dynamic  Autoregressive  Model 

(r = 0.2, T = 60)




*Note  that  this  index  does  not  include  the  intercept  term. 
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TABLE  A5-7 :  Estimated  Bias  on  the  Basis  of  the  Median  Using  Dynamic  Autoregressive  Model 

(r = 0.2, T = 30)
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TABLE  A5-8:  Estimated  Bias  on  the  Basis  of  the  Median  Using  Dynamic  Autoregressive  Model 

(r = 0.6, T = 30)
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APPENDIX  VI:  MONTE  CARLO  RESULTS  OF  TESTS  OF  HYPOTHESES  FOR 


THE  STATIC  AUTOREGRESSIVE  MODEL 

This appendix presents the detailed statistics concerning the
Type I error and the power of the test of significance for
the autoregressive model without lagged endogenous
variables.
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TABLE  A6-5:  Percentage  of  Type  I  Error  at  5%  Significant  Level  Using  Static  Model 
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APPENDIX  VII:  MONTE  CARLO  RESULTS  OF  TESTS  OF  HYPOTHESES  FOR 


THE  DYNAMIC  AUTOREGRESSIVE  MODEL 

This appendix presents the detailed statistics concerning the
Type I error and the power of the test of significance for
the dynamic autoregressive model studied in the second part
of this thesis.
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TABLE  A7-5:  Percentage  Type 
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TABLE  A7-6:  Percentage  Type 




TABLE  A7-7 :  Percentage  Power  of  Test  of  Significance  at  5%  Level  Using  Dynamic  Model 

(r = 0.2, T = 30)
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TABLE  A7-8:  Percentage  Power  of  Test  of  Significance  at  5%  Level  Using  Dynamic  Model 

(r = 0.6, T = 30)
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TABLE  A7-9:  Percentage  Power  of  Test  of  Significance  at  5%  Level  Using  Dynamic  Model 

(r = 0.9, T = 30)
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*Note  that  this  index  does  not  include  the  intercept  term. 
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TABLE  A7-12:  Percentage  Power  of  Test  of  Significance  at  5%  Level  Using  Dynamic  Model 

(r = 0.9, T = 60)
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